(12) INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT) 



(19) World Intellectual Property Organization 

International Bureau 

(43) International Publication Date 
13 April 2006 (13.04.2006) 




PCT 



(10) International Publication Number 

WO 2006/039009 A2 



(51) International Patent Classification: 
A61B 5/05 (2006.01) 

(21) International Application Number: 

PCT/US2005/030081 

(22) International Filing Date: 22 August 2005 (22.08.2005) 

(25) Filing Language: English 

(26) Publication Language: English 

(30) Priority Data: 

10/957,467 30 September 2004 (30.09.2004) US 

(71) Applicant (for all designated States except US): ACCU- 
RAY INC. [US/US]; 1310 Chesapeake Terrace, Sunny- 
vale, CA 94089 (US). 

(72) Inventors; and 

(75) Inventors/Applicants (for US only): THOMSON, Euan 

[GB/US]; 15624 Benedict Lane, Los Gatos, CA 95032 
(US). DOOLEY, John, Robinson [US/US]; 21007 Sher- 
man Drive, Castro Valley, CA 94552 (US). KUDUVALLI, 
Gopinath [CA/US]; 3988 Williams Road, San Jose, 
CA 95117 (US). WANG, James [US/US]; 1806 Mark 
Twain Street, Palo Alto, CA 94303 (US). EARNST, Eric 
[US/US]; 12759 Saratoga Woods Circle, Saratoga, CA 
95070 (US). RAANES, Chris [US/US]; 50 Bear Gulch 
Drive, Portola Valley, CA 94028 (US). 



(74) Agents: VINCENT, Lester, J. et al.; Blakely, Sokoloff, 
Taylor & Zafman LLP, 12400 Wilshire Boulevard, 7th' 
Moor, Los Angeles, CA 90025 (US). 

(81) Designated States (unless otherwise indicated, for every 
kind of national protection available): AE, AG, AL, AM, 
AT, AU, AZ, BA, BB, BG, BR, BW, BY, BZ, CA, CH, CN, 
CO, CR, CU, CZ, DE, DK, DM, DZ, EC, EE, EG, ES, FI, 
GB, GD, GE, GH, GM, MR, IIU, ID, 1L, IN, IS, JP, KE, 
KG, KM, KR KR, KZ, LC\ LK, LR, LS, LT, LU, LV, MA, 
MD, MG, MK, MN, MW, MX, MZ, NA, NG, NL NO, NZ, 
OM, PG, PI I, PL, PT, RO, RU, SC, SD, SE, SG, SK, SL, 
SM, SY, TJ, TM, TN, TR, TT, TZ, UA, UG, US, UZ, VC, 
VN, YU, ZA, ZM, ZW. 

(84) Designated States (unless otherwise indicated, for every 
kind of regional protection available): ARIPO (BW, GH, 
GM, KE, LS, MW, MZ, NA, SD, SL, SZ, TZ, UG, ZM, 
ZW), Eurasian (AM, AZ, BY, KG, KZ, MD, RU, TJ, TM), 
European (AT, BE, BG, CH, CY, CZ, DE, DK, EE, ES, FI, 
FR, GB, OR, HU, IE, IS, IT, LT, LU, LV, MC, NL, PL, PT, 
RO, SE, SI, SK, TR), OAPI (BF, BJ, CP, CG, CI, CM, GA, 
GN, GQ, GW, ML, MR, NE, SN, TD, TG). 

Published: 

— without international search report and to be republished 
upon receipt of thai report 

For two-letter codes and other abbreviations, refer to tfie "Guid- 
ance Notes on Codes and Abbreviations " appearing at the beg in- 
ning of each regular issue of the PCT Gazette. 



< 
Os 

o 
o 

Os 

o 

o 
o 

o 



(54) Title: DYNAMIC TRACKING OF MOVING TARGETS 

(57) Abstract: Treatment targets such as tumors or lesions, located within an anatomical region that undergoes motion (which may 
be periodic with cycle P), are dynamically tracked. A 4D mathematical model is established tor the non-rigid motion and deformation 
of the anatomical region, from a set of CT or other 3D images. The 4D mathematical model relates the 3D locations of part(s) of the 
anatomical region with the targets being tracked, as a function of the position in time within P. Using fiducialless non-rigid image 
registration between pre-operative DRRs and intra-operative x-ray images, the absolute position of the target and/or other part(s) of 
the anatomical region is determined. The cycle P is determined using motion sensors such as surface markers. The radiation beams 
are delivered using: 1) the results of non-rigid image registration; 2) the 4D model; and 3) the position in time within P. 
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DYNAMIC TRACKING OF MOVING TARGETS 



BACKGROUND 

[0001] In some medical applications, it may be necessary to dynamically 
track targets that move with time. For example, in radiosurgery it may be 
desirable to dynamically track tumors and/or lesions in the human body 
that move with respiration and/or heartbeat. In radiosurgery, accurate 
trajectories of the radiation beams through the patient anatomy to the 
lesion or tumor being treated can be critical, in order to achieve the 
radiation dose distribution that was computed during treatment planning 
time. For regions of the human anatomy that move, for example due to 
breathing or heartbeat, it is important to take such motions into 
consideration, when computing the effect of the motion on the treatment 
plan being generated. Dynamic tracking may also be useful in 
applications other than radiosurgery in which parts of the anatomy move, 
due to breathing, heartbeat, or any other type of motion. 
[0002] Fiducial markers have been used in the past, in order to track 
moving regions of the anatomy. Fiducials-based tracking can be difficult 
for a patient, for a number of reasons. For example, high accuracy tends 
to be achieved by using bone-implanted fiducial markers, but less invasive 
techniques such as skin-attached markers or anatomical positions tend to 
be less accurate. Implantation of fiducials into a patient is generally 
painful and difficult, especially for the C-spine, the implantation process 
for which may frequently lead to clinical complications. 
[0003] In some methods that use gating to handle anatomical motion, 
dynamic tracking may be achieved by establishing a relationship between 
internally implanted fiducials, and externally placed markers that are 
tracked in real time. These methods do not take into account the non- 
rigid motions and deformations of the surrounding anatomy, as a function 
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of the motion cycle. 

[0004] A method and system that address these deficiencies are thus 
desirable. In particular, it is desirable to provide a reliable and efficient 
method and system for dynamically tracking moving targets. 

SUMMARY 

[0005] A method and system are presented for dynamically tracking one 
or more targets within an anatomical region so that desired doses of 
therapeutic radiation can be delivered to the targets, while the anatomical 
region is undergoing motion. The targets may be tumors or lesions, for 
example. The anatomical region may include one or more reference 
structures, in addition to the targets. The locations of the targets within 
the moving anatomical region are determined relative to the reference 
structures, while taking into account the deformation that the anatomical 
region undergoes during its motion. The deformation of the anatomical 
region includes non-rigid deformations, as well as rigid deformations. 
[0006] In one embodiment, the motion of the anatomical region may be 
periodic. Examples of such periodic motion may include, for example, 
respiration or heartbeat. The periodic motion may be characterized by a 
periodic cycle P. In other embodiments, the motion of the anatomical 
region may be aperiodic motion, in which the cycle P may change over 
time, or for which no cycle P is defined. Throughout this patent 
application, the terms "periodic motion" and "periodic motion with cycle P" 
include periodic motions having a time-varying period P(t). In other words, 
the terms "periodic motion" and "periodic motion with cycle P" should be 
understood as being descriptive of, and referring to, the changing nature 
of the motions in the human anatomy. 

[0007] In one embodiment, dynamic tracking of a periodic motion with 
cycle P is accomplished by constructing a 4D (four dimensional) 
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mathematical model that describes the non-rigid deformation and non- 
rigid motion of the anatomical region, as a function of the relative position 
in time within the cycle P. The 4D mathematical model may be 
constructed by generating a plurality of images (including but not limited to 
CT images) lj (j = 1, . . . , p) of the anatomical region, where each image is 
taken at one of a succession of time points tj (j = 1 , . . . p) within the cycle 
P. The continuous non-rigid deformation of the anatomical region, as a 
function of the cycle P, is mathematically modeled by morphing a CT 
image, acquired at one instant in time, into another CT image acquired at 
a subsequent instant in time. The 4D mathematical model relates the 3D 
(three dimensional) locations of the skeletal structures with the 3D 
locations of the targets, as a function of the instant in the cycle P (which 
may be, for example, a breathing cycle or a heartbeat cycle). 
[0008] In one embodiment, the 4D mathematical model of the non-rigid 
deformation is used by a treatment plan generator to generate a radiation 
dose distribution that prescribes a desired amount of the therapeutic 
radiation to be delivered, in real time, to the targets within the moving 
anatomical region. The radiation dose distribution accounts for the non- 
rigid deformation of the moving anatomical region during its periodic 
motion. 

[0009] The cycle P for the periodic motion may be determined for example 
by dynamically tracking motion sensors (such as surface markers or LEDs 
(light emitting diodes), by way of example) attached to the skin or exterior 
surface of the anatomical region. The relative position within the motion 
cycle P of any desired time point, including the time point tj at which the j- 
th image lj was taken, can be determined in this way. 
[0010] In this patent, the term "real time" refers to a time scale that is 
substantially simultaneous to the actual radiation treatment and delivery. 
For example, tracking that occurs at a speed of at least several Hz or 



3 



WO 2006/039009 



PCT/US2005/030081 



higher falls within the time scale described as "real time" in this patent. In 
this patent, the term "near real time" refers to a time scale that is slower 
than the time scale referred to as "real time," for example by about one or 
more orders of magnitude. For example, events occurring at a speed of 
less than about once a second falls within the time scale described as 
"near real time" in this patent. 

[001 1] In one embodiment, the near real time locations of the reference 
structures are determined by performing fiducial-less tracking of the 
reference structures. A set of DRRs are generated from the CT image 
data, using the same beam projection geometry as is used to acquire the 
live x-ray images. The live x-ray images, acquired intra-operatively in 
near real time, are registered with the DRRs (digitally reconstructed 
radiographs). In one embodiment, a non-rigid image registration 
algorithm may be used, in which the non-rigid transformation parameters 
are derived from a 3D full motion field. The full motion field may be 
constructed by estimating many local motion vectors. In one embodiment, 
multi-level block matching may be used in the non-rigid image registration 
algorithm, in conjunction with a similarity measure based on pattern 
intensity. In one embodiment, hierarchical mesh motion estimation may 
be performed. 

[0012] In one embodiment, the locations of the targets can be determined 
in real time, by correlating the known near real time locations of the 
reference structures, which were determined using non-rigid image 
registration, with the target locations that are being sought, using the 4D 
mathematical model that describes how the locations of the targets are 
related to the locations of the reference structures, as a function of the 
relative position in time within the cycle P. The information from the 
motion sensors can be used to determine the relative position within the 
cycle P of any desired time point. 
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[0013] In one embodiment, radiation treatment is delivered to the real time 
locations of the targets, in accordance with the radiation dose distribution 
generated by the treatment planning generator. The radiation beam 
delivery can be synchronized to the instant in breathing cycle determined 
in the treatment planning. 

BRIEF DESCRIPTION OF THE DRAWINGS 

[0014] FIG. 1 provides a schematic flow chart of a method for dynamic 
fiducial-less tracking of moving targets. 

[0015] FIG. 2 schematically illustrates the acquisition of pre-operative 
images (e.g. 3D CT scans) of a moving target within a patient's anatomy, 
taken at different times points within a breathing cycle of the patient. 
[0016] FIG. 3 schematically illustrates the tracking of the 3D motion of the 
target as a function of the breathing cycle of a patient, using one of: a) a 
rigid reference structure that does not move with breathing; and b) 
multiple reference structures that themselves move with breathing. 
[0017] FIG. 4 schematically illustrates fiducial-less tracking of a moving 
target during delivery of treatment radiation, by registering 2D near real 
time x-ray images of the target with DRRs generated from a pre-operative 
3D scan taken at a specific time point within the breathing cycle of the 
patient. 

[0018] FIG. 5 illustrates a flowchart of a non-rigid image registration 
algorithm that may be used for a 2D/3D registration between pre- 
operative 2D DRRs, and intraoperative x-ray projection images. 
[0019] FIG. 6 schematically illustrates the use of block matching when 
estimating local motion for a point of interest within a target in a patient. 
[0020] FIG. 7 schematically illustrates a multi-resolution image 
representation, for implementing multi-level block matching in one 
embodiment, using multiple candidates. 
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[0021] FIG. 8 schematically illustrates a neighborhood R for calculating a 
similarity measure based on pattern intensity. 

[0022] FIG. 9 illustrates the estimation of global translation, between the 
image center of a DRR and the image center of a corresponding x-ray 
image. 

[0023] FIG. 10 schematically illustrates a mesh grid established for a DRR 
of a target, and a corresponding mesh grid established for an x-ray image 
of the target, in an embodiment in which the target is located within the 
cervical region of the spine. 

[0024] FIG. 1 1 illustrates a mesh hierarchy, during mesh motion 
estimation, the mesh hierarchy starting from a relatively coarse mesh, and 
progressing onto finer meshes. 

[0025] FIG. 12 schematically illustrates the passing on of node estimation, 
from a coarse mesh resolution level onto a finer mesh resolution level. 
[0026] FIG. 13 schematically illustrates the determination of a motion 
vector for a point of interest, by interpolation from surrounding nodes. 
[0027] FIG. 14 schematically illustrates, in vectorial form, a full motion field 
reconstructed from many estimated local motion vectors. 
[0028] FIG. 15 illustrates the geometric relations between a three- 
dimensional treatment target, and two orthogonal 2D x-ray projections 
[0029] FIG. 16 schematically illustrates dynamic tracking of a moving 
target within a patient during real time delivery of treatment radiation, 
using surface markers (e.g. LEDs) to monitor the breathing motion of the 
patient. 

[0030] FIG. 17 provides a schematic block diagram of a system for 
dynamically tracking targets within an anatomical region that is 
undergoing periodic motion (e.g. respiration or heartbeat) and delivering 
therapeutic radiation to the moving targets. 
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DETAILED DESCRIPTION 

[0031] A number of techniques are described for dynamically tracking 
tumors/lesions in the human body that is undergoing motion. The 
methods are of principal use in radiosurgery, but may also be useful in 
other applications where it may be necessary to dynamically track parts of 
the anatomy that move, for example because of respiration or heartbeat. 
[0032] Some portions of the detailed description that follows are presented 
in terms of algorithms and symbolic representations of operations on data 
bits within a computer memory. These algorithmic descriptions and 
representations are the means used by those skilled in the data 
processing arts to most effectively convey the substance of their work to 
others skilled in the art. An algorithm is conceived to be a self-consistent 
sequence of acts leading to a desired result. These acts require physical 
manipulations of physical quantities. Usually, though not necessarily, 
these quantities may take the form of electrical or magnetic signals 
capable of being stored, transferred, combined, compared, and otherwise 
manipulated. It has proven convenient at times, principally for reasons of 
common usage, to refer to these signals as bits, values, elements, 
symbols, characters, terms, numbers, or the like. 
[0033] It should be borne in mind, however, that all of these and similar 
terms are to be associated with the appropriate physical quantities and 
are merely convenient labels applied to these quantities. Unless 
specifically stated otherwise, it is appreciated that throughout the 
descriptions below, discussions that utilize terms such as "processing" or 
"computing" or "calculating" or "determining" or "displaying" or the like, 
refer to the actions and processes of a computer system, or a similar 
electronic computing device. The computer system manipulates and 
transforms data, represented as physical or electronic quantities within the 
computer system's registers and memories, into other data similarly 
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represented as physical quantities within the computer system's 
memories or registers, or within other such information storage, 
transmission or display devices. 

[0034] The methods and techniques described below can be implemented 
by an apparatus for performing the operations discussed below. Such an 
apparatus may be specially constructed for the required purposes, or it 
may comprise a general purpose computer, selectively activated or 
reconfigured by a computer program stored in the computer. Such a 
computer program may be stored in a computer readable storage medium, 
such as, but not limited to, any type of disk including floppy disks, optical 
disks, CD-ROMs, and magnetic-optical disks, read-only memories 
(ROMs), random access memories (RAMs), EPROMs, EEPROMs, 
magnetic or optical cards, or any type of media suitable for storing 
electronic instructions. Each such computer readable storage medium 
may be coupled to a computer system bus. 

[0035] The algorithms and displays presented herein are not inherently 
related to any particular computer or other apparatus. Various general 
purpose systems may be used with programs that are designed in 
accordance with the teachings below, or it may prove convenient to 
construct a more specialized apparatus to perform the requisite methods 
and techniques. For example, any of the methods described below can 
be implemented in hard-wired circuitry, or by programming a general 
purpose processor, or by any combination of hardware and software. 
One of skill in the art will appreciate that the methods and techniques 
described below can be practiced with a wide variety of computer system 
configurations, including hand-held devices, multiprocessor systems, 
microprocessor-based or programmable consumer electronics, network 
PCs, minicomputers, mainframe computers, and the like. The methods 
and techniques described below can also be practiced in distributed 
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computing environments where tasks are performed by remote 
processing devices that are linked through a communications network. 
[0036] The methods and systems described below may be implemented 
using computer software. Computer software may be referred to using 
different terms, for example a program, a procedure, or an application. If 
written in a programming language conforming to a recognized standard, 
sequences of instructions designed to implement these methods and 
systems can be compiled for execution on a variety of hardware platforms 
and for interface to a variety of operating systems. Also, these methods 
and systems are not described with reference to any particular 
programming language. It will be appreciated that a variety of 
programming languages may be used to implement the teachings of the 
invention as described herein. Furthermore, it is common in the art to 
speak of software, in one form or another, as taking an action or causing a 
result. Such expressions are merely a shorthand way of saying that the 
execution of the software by a computer causes one or more processors 
in the computer to perform an action or produce a result. 
[0037] FIG. 1 provides a schematic flow chart 100 describing the dynamic 
fiducial-less tracking of moving targets. The targets are located within an 
anatomical region. The targets may be tumors or lesions, for example, or 
organs of interest. The anatomical region typically includes one or more 
reference structures, in addition to the targets. In one embodiment of the 
dynamic tracking method and system, the reference structures may be 
skeletal (i.e. bony) structures. In another embodiment, the reference 
structures may be other natural anatomical structures, including but not 
limited to cartilages or other (typically rather dense) organs. In yet 
another embodiment, the reference structures may be artificial structures, 
for example fiducials or surgical hardware. 

[0038] As mentioned earlier, throughout this description the term "periodic 
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motion with cycle P" should be understood as including periodic motions 
in which the cycle P of the periodic motion, as well as the amplitude and 
waveform of the motion, change with time, In other words, the term 
"periodic motion" or "periodic motion with cycle P" should be understood 
as also referring to the changing nature of the motions in the human 
anatomy. 

[0039] As mentioned earlier, the anatomical region may undergo a 
deformation (which may be a non-rigid deformation) during its motion. 
While in the embodiment illustrated in FIG. 1, the anatomical region is 
described as moving periodically while undergoing a non-rigid deformation, 
any other type of motion (e.g. aperiodic motion) and any type of 
deformation of the anatomical region may be tracked, using the method 
and system described in this patent. 

[0040] In overview, in the embodiment illustrated in FIG. 1 the locations of 
the targets, within the periodically moving and non-rigidly deforming 
anatomical region are determined, in steps 105-160. A radiation dose 
distribution is computed that results from a substantially continuous 
delivery of radiation through the non-rigidly moving and deforming 
anatomical region. Finally, in step 170, radiation is delivered in real time 
to the targets, in accordance with the computed radiation dose distribution. 
[0041] As a first step, a set of CT images lj (j = 1 , . . . p) are acquired in 
step 105, each CT image taken at one of a succession of time points tj (j = 
1 .... p) within the cycle P. In step 1 1 0, the cycle of the periodic motion 
(e.g., the respiratory cycle or the heartbeat cycle) is established, for 
example by dynamic tracking of markers or other sensors attached to the 
skin of the anatomical region. 

[0042] In step 120, a 4D (3D + time) mathematical model is constructed 
from these CT scans and from the motion cycle information obtained from 
the sensors. The mathematical model describes a non-rigid motion and 
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deformation of the anatomical region as it undergoes its periodic motion 
(e.g. respiration), as a function of the instant in the motion cycle P. The 
4D mathematical model relates the locations of the targets to the locations 
of the reference structures, as a function of the relative position in time 
within the cycle P, within the periodically moving anatomical region. More 
generally, this 4D model may describe the 3D + time relationship between 
a part or parts of the anatomy, and a target to be tracked for radiosurgery, 
as a function of the instant in time in the breathing cycle. 
[0043] Next, in steps 130, 135, and 140 the absolute position of a part or 
parts of the anatomy is determined, using x-ray imaging and 2D-to-3D 
image registration. In these steps, the locations of the reference 
structures and/or the targets are determined in near real time, by fiducial- 
less tracking of the reference structures and/or targets using a non-rigid 
image registration algorithm. 

[0044] As explained earlier, in this patent the term "real time" refers to a 
time scale that is substantially simultaneous to the actual radiation 
treatment and delivery. In just one exemplary embodiment, intra- 
operative real-time tracking of a target may be implemented using optical 
markers which track at about a 30 Hz rate, and using a model which 
updates its predictions at about a 80 Hz rate. This is intended to be an 
illustrative example only, and real time tracking may occur at a wide range 
of different tracking speeds, generally higher than about 1 Hz. The term 
"near real time" refers to a time scale that is slower, for example by about 
one or more orders of magnitude, than the time scale described by the 
term "real time." As an example, the time scale for acquiring x-ray images, 
which may range from about a fraction of a second to about several 
seconds, will be described as "near real time." 

[0045] A set of DRRs of the anatomical region are generated, in step 130, 
from the CT images acquired in step 105. In step 135, live x-ray 
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projection images of the anatomical region are acquired. In step 140, the 
DRRs of the reference structures are registered with the near real time x- 
ray images of the reference structures. 

[0046] Step 150 describes 4D treatment planning, in which a radiation 
dose distribution is computed that results from continuous beam delivery 
through the non-rigidly moving and deforming anatomical region. In this 
step, the radiation beam trajectories are determined, using the knowledge 
(obtained in step 140) of the absolute locations of the reference structures, 
and using the 4D model that relates the reference structures to the instant 
in breathing cycle (as determined using information from the sensors), 
and to the targets whose locations are being tracked. 
[0047] In step 160, the target locations are determined. In one 
embodiment, the 4D mathematical model may be used, together with the 
absolute positions of the skeletal structures as determined by fiducial-less 
tracking, and the information obtained from the sensors regarding the 
motion cycle, to determine the target locations. In an alternative 
embodiment, the target locations may be determined by 2D/3D non-rigid 
image registration, during which the DRRs (generated during the 
treatment planning stage) are registered onto near real time x-ray images. 
Finally, in step 170, radiation is delivered to the targets in accordance with 
the radiation dose distribution generated through 4D treatment planning. 
[0048] FIG. 2 schematically illustrates the acquisition of preoperative 
images (e.g. 3D CT scans) of a moving target within a patients anatomy. 
In the illustrated embodiment, the target moves due to breathing motion of 
the patient. While the illustrated embodiment shows 3D CT scans, any 
other type of 3D scans may be performed, including but not limited to 3D 
MRI (magnetic resonance imaging) scans, 3D PET (positron emission 
tomography) scans, and 3D ultrasound scans. In the illustrated 
embodiment, a set of CT scans are taken at different times points tj Q = 
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1, ... , k, ... , I, m, .... p) within a breathing cycle P of the patient. In the 
illustrated embodiment, U < k < ti < t m < tp. The time points tj correspond 
to different epochs in the patient breathing cycle. The cycle P is 
monitored by an external sensor, e.g. a breathing sensor. For example, a 
surface marker (such as an LED) or a similar device may be attached to 
the skin. In embodiments in which the target undergoes motions other 
than respiration, different types of sensors (e.g. a cardiac monitor when 
heartbeat is being monitored) may be used. 
[0049] In the illustrated embodiment, the CT images are taken at time 
points t k , t|, and t p , respectively. The epochs or time points within the 
breathing cycle P are preferably chosen to substantially encompass the 
overall dynamic range of the periodic motion. For example, in one 
embodiment, the time points may include: a time point t| corresponding to 
a trough of the cycle P; a time point t p corresponding to a peak of the 
cycle P; and a third time point t k disposed at an intermediate location 
between the peak and the trough of the cycle P. In other embodiments, 
the time points selected for taking the CT images may include more than 
the three time points tj, tp, and tk described above. 
[0050] From this set of CT studies, a 4D mathematical model is 
constructed that morphs the CT image acquired at one instant or time 
point in the motion cycle into another CT image acquired at a subsequent 
instant or time point in the motion cycle, providing a model for the 
continuous non-rigid deformation of the anatomy as a function of the 
motion cycle. In image processing, it is well known in the art to morph 
one image into another image, and to describe this in terms of a 
mathematical model. Any standard software and/or algorithms that are 
known and may be commercially available can be used. 
[0051] In one embodiment, the 4D mathematical model constructed from 
the set of CT images shown in FIG. 2 is used for 4D treatment planning, 
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i.e. to compute the dose distributions resulting from a continuous radiation 
beam delivery through the non-rigidly moving anatomy, taking into 
account the non-rigid motion and deformation of the treatment targets as 
a function of the position in time within the motion cycle. In this 
embodiment, 4D treatment planning consists of two parts: a) creating a 
mathematical model for the non-rigid deformations of the anatomy as a 
function of the instant in the motion cycle, as described above; and b) 
using the mathematical model to compute the dose distributions resulting 
from continuous radiation beam delivery through the non-rigidly moving 
anatomy. 

[0052] In order to compute the desired radiation dose distributions, the 
beam trajectories are initially defined with respect to a nominal patient 
coordinate system. In one embodiment, the nominal patient coordinate 
system may be chosen to orient with respect to one of the several CT 
images illustrated in FIG. 2 and acquired to cover the motion cycle. 
Different orientations may be chosen in other embodiments. In one 
embodiment, each radiation beam is turned on from the time 
corresponding to the point in time in which each CT image was taken, and 
remains on for a duration selected to give the desired dose distribution. 
The dose absorption is calculated as a function of time from the initial time 
point, taking into account the non-rigid deformations of the patient 
anatomy. 

[0053] In one embodiment, the 4D mathematical model relates the 3D 
locations of one or more reference structures with the 3D locations of the 
target, as a function of the instant in the motion cycle. In the 4D model, 
one or more of the selected reference structures may be stationary with 
respect to the motion cycle, while others of the selected reference 
structures may undergo non-rigid motion with respect to the motion cycle. 
[0054] FIG. 3 schematically illustrates the tracking of the 3D motion of the 
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target as a function of the motion cycle of a patient, using one of: a) a 
substantially rigid reference structure that is substantially stationary 
throughout the motion of the patient; and 

b) multiple reference structures that themselves move with the motion of 
the patient. Two kinds of models are possible: (a) 3D motion of the target 
(tumor/lesion) as a function of the motion cycle, with reference to a 
substantially rigid reference structure that is substantially stationary 
(including but not limited to vertebral structures) or (b) 3D motion of the 
target (e.g. tumor/lesion) as a function of the motion cycle, with reference 
to multiple reference structures that themselves move, along with the 
periodic motion of the anatomical region. 

[0055] In the embodiment illustrated in FIG. 3, the target 210 moves within 
an anatomical region 200 because of the periodic motion of the 
anatomical region 200. The reference structures are illustrated using 
reference numeral 230 and 232. The reference structures 230 
themselves move with breathing. On the other hand, reference structure 
232 is a substantially rigid reference structure that does not move with the 
breathing or other periodic motion of the anatomical region 210. 
[0056] The 3D location and orientation of the multiple skeletal structures 
enable vectors to be drawn that point from each skeletal structure to the 
tumor or lesion. A model that describes the 3D motion of the tumor/lesion 
as a function of the breathing cycle, with reference to a rigid skeletal 
structure (such as the skeletal structure indicated in FIG. 3 using 
reference numeral 210) that itself does not move with respect to breathing, 
may be used for example in conjunction with the vertebral structures for a 
patient lying supine for treatment, which is a practical example of such a 
non-moving skeletal structure. In another model, in which reference is 
made to rigid skeletal structures that themselves move with breathing, the 
3D motion of the tumor/lesion can be described as a compound function 
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of 1) the breathing cycle and the 2) location of multiple skeletal structures 
which themselves move with breathing. 

[0057] Once the relationship between the locations of the reference 
structures and the locations of the target is determined by the 4D model, 
as a function of the point in time within the periodic motion cycle, the 
absolute position of the reference structures is determined in near real 
time. In one embodiment, this is accomplished by 1) generating DRRs of 
the reference structures from the 3D CT images, which were shown in 
FIG. 2 and which were used to generate the 4D mathematical model; 2) 
taking "live" or near real-time x-ray images of the reference structures; 
and then 3) performing non-rigid image registration between the live x-ray 
images and the DRRs. Once the locations of the reference structures are 
determined using non-rigid image registration, the locations of the targets 
can be easily determined using the 4D mathematical model described 
above. 

[0058] FIG. 4 schematically illustrates fiducial-less tracking of reference 
structures. In one embodiment, this fiducial-less tracking is performed by 
registering 2D near real time x-ray images of the target with DRRs 
generated from a pre-operative 3D scan taken at a specific time point 
within the motion cycle of the patient. The position and orientation of the 
reference structures in the patient anatomy (at the time of treatment 
delivery) can be tracked with respect to the corresponding structures in 
one of the CT studies, using a 2D to 3D registration methods. In one 
embodiment, a non-rigid 2D/3D image registration is performed, described 
in detail below. 

[0059] As a first step in the 2D/3D non-rigid image registration process, a 
library of DRRs is generated for the projection geometry that will be used 
in the acquisition of live images at the time of treatment delivery. A pair of 
live (or "near real time") x-ray images of the patient is acquired during 
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treatment delivery, using the same projection geometry as used for 
generating the DRRs. The CT images used to generate the DRRs will 
correspond to one of the epochs in the motion cycle, typically the same 
one with respect to which beams are assigned to during treatment 
planning. The DRRs may be generated for several translations and 
orientations of the CT image, in order to cover the typical range of patient 
movements during treatment. 

[0060] The acquired images are registered with the DRRs using a feature 
recognition algorithm that tracks the reference structures. The image 
registration algorithm (described in detail below) may be repeated for 
multiple reference structures, to give the position and orientation of each 
structure with respect to the corresponding structures in the CT study 
(from which DRRs are made). In one embodiment, the difference in 
imaging characteristics of the tumor or lesion or a nearby anatomical 
region may be enhanced further by using high-sensitivity x-ray detectors. 
[0061] In one embodiment, the location of the tumor/lesion can be derived 
from the locations of the reference structures, determined by the non-rigid 
registration process. The location and orientation of multiple reference 
structures tracked using the fiducial-less algorithm are used to interpolate 
the location of the tumor/lesion, using their corresponding geometric 
relationships, as learned from the CT studies and the 4D model. In 
another embodiment, fiducial-less tracking is performed for the targets 
(e.g. tumors/lesions) themselves. If the target being tracked is sufficiently 
different in x-ray imaging characteristics relative to the surrounding tissue, 
the target itself can be directly tracked using the 2D-to-3D registration 
technique described below. 

[0062] A non-rigid 2D-to-3D image registration technique that accounts for 
non-rigid deformations of the anatomy, and which uses anatomical 
reference structures instead of fiducials, is described below. While the 
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non-rigid image registration algorithm described below is described in the 
context of skeletal structures, and in particular skeletal structures in the 
spinal region of the human anatomy, it should be understood that 
reference structures other than spinal skeletal structures may also be 
used with this non-rigid image registration algorithm. 
[0063] The non-rigid image registration technique is also described in the 
following five co-owned patent applications: 1) U.S. patent application 
Serial No. 10/880,486, characterized by attorney docket number ACCL- 
136, entitled "Fiducial-less Tracking With Non-Rigid Image Registration"; 2) 
U.S. patent application Serial No. 10/880,486, characterized by attorney 
docket number ACCL-137, entitled "Motion Field Generation For Non-rigid 
Image Registration"; 3) U.S. patent application Serial No. 10/880,486, 
characterized by attorney docket number ACCL-146, entitled "ROI 
Selection In Image Registration"; 4) U.S. patent application Serial No. 
10/881,208, characterized by attorney docket number ACCL-1 47, entitled 
"Image Enhancement Method and System For Fiducial-less Tracking of 
Treatment Targets"; 5) U.S. patent application Serial No. 10/880,486, 
characterized by attorney docket number ACCL-1 50, entitled "DRR 
Generation Using A Non-Linear Attenuation Model." All five patent 
applications, owned are incorporated by reference herein in their 
entireties. 

[0064] FIG. 5 illustrates a flowchart 400 of a non-rigid image registration 
algorithm that may be used for a 2D/3D registration between 2D DRRs of 
an anatomical region, reconstructed from pre-operative CT scan data, and 
intra-operative, near real-time x-ray projection images of the target within 
the anatomical region. In particular, the DRR is reconstructed from CT 
scan data representative of a CT image lj taken at a specific time point tj 
within the periodic cycle P. 

[0065] As a first step, 2D DRRs may be generated from pre-operative 3D 
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scan data representative of a CT image lj, in step 402. In one 
embodiment, the images for which non-rigid image registration is 
performed (i.e., DRRs and x-ray images) are discretized images each 
characterized by an array of pixels, each pixel having an associated pixel 
value representative of the intensity of the image at a surface unit area 
corresponding to the pixel. 

[0066] In one embodiment, an improved DRR generation process can be 
implemented in step 402 to bring out the skeletal reference structures, 
which are usually not easily visible in the images, or even may be hidden. 
In step 402, the CT scan data are modified based on a non-linear 
attenuation model that emphasizes the skeletal structures and thus 
improves the quality of the DRRs. In step 403 in flowchart 400, an image 
enhancement technique may also be implemented for the DRRs. In this 
step, a top-hat filter is used to bring out the skeletal structures in the 
DRRs generated in step 402. 

[0067] In the illustrated embodiment, image registration is performed in a 
selected region of interest (ROI) within the enhanced DRR, in order to 
improve efficiency. Accordingly, an ROI is defined in the DRR, in step 
404, after enhancement of the DRRs. The ROI selection process 
performed in step 404 is based on image entropy, and is fully automatic 
so as not to require user interaction. Intraoperative 2D x-ray projection 
images are then generated, in near real time, in step 410. Image 
enhancement is performed on the x-ray images, in step 415, using a top- 
hat filter by analogy to step 403. 

» 

[0068] Non-rigid image registration is then performed between the 
enhanced x-ray images and the enhanced DRRs, within the ROI. In 
particular, a similarity measure is used to compare the pixel intensities in 
the x-ray images and the DRR images, in order to determine any change 
in the position and/or orientation and/or physiological deformation of the 
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patient. In steps 420-450, a non-rigid deformation that describes real 
patient movement and body deformation is defined. To define the non- 
rigid deformation, a full motion field is constructed that is composed of 
many local motion fields, i.e. a plurality of locally estimated motion vectors. 
To estimate local motion at a given point of interest within the ROI, a 
similarity measure based on pattern intensity is used to compare pixel 
intensities. 

[0069] A full motion field that is composed of many local motions may 
describe any desired non-rigid deformation. Further, a full motion field 
derived in this manner can account for non-rigid motions (translations 
and/or rotations) of the object, in addition to non-rigid deformations, 
between different image acquisitions of the object. In order to efficiently 
compute the local motion vectors at any point of interest within the ROI, 
hierarchical mesh motion estimation and multi-level block matching 
(performed in conjunction with an intensity-based similarity measure) are 
performed. These methods allow for a fast computation of the image 
registration algorithm 400. A smoothness constraint is imposed to 
reconstruct the motion field at mesh nodes in which mismatching occurred. 
The non-rigid transformation parameters for the non-rigid image 
registration are then computed from the full motion field. 
[0070] In the embodiment illustrated in FIG. 5, the non-rigid deformations 
described by the full motion field occur in between the acquisition of the 
3D CT scan data of a treatment target region in a patient, and the 
acquisition of the x-ray projection images of the target region. In step 420, 
a global translation of the entire image is first estimated. The estimated 
global translation is used as the initial estimate for all further local motion 
estimation. In the next step 430, mesh nodal motion estimation is 
performed, using a hierarchical mesh structure designed to estimate the 
local motion in multiple levels. In the next step 440, motion field 
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reconstruction is performed for those mesh nodes in which a mismatch 
occurs. The reconstruction of the motion field is performed by imposing a 
smoothness constraint, which is based on the assumption that local 
motions are continuous, because of matter coherence. In step 450, the 
local motion vector at any desired point of interest is derived by 
interpolating from the nodal motions estimated for the mesh nodes that 
surround the point of interest. The full motion field is then constructed, 
using the local motion vectors derived for a plurality of desired points of 
interest. 

[0071] In the final steps, shown as step 455 and step 460 in FIG. 5, the 
non-rigid transformation parameters are derived from the full motion field. 
In step 455, the target displacements are derived from the full motion field. 
In step 460, the average rigid transformation is derived from the full 
motion field. 

[0072] The quality of DRR images relies on proper attenuation modeling, 
as well as a proper interpolation scheme for interpolation the CT numbers. 
In one embodiment, in step 402 (in the flowchart shown in FIG. 2), an 
improved x-ray attenuation model is formulated for fiducial-less tracking, 
so that the DRRs become more like the real x-ray projection images. A 
linear attenuation model is no longer assumed, and the CT numbers are 
modified in order to compensate for the above-mentioned a difference in 
the bone-to-tissue attenuation ratio. On the basis of many experiments 
conducted with patient clinical data, the following empirical equation was 
formulated to modify the original CT numbers: 

C(x,y,z) = aC 0 (x,y,z)e bC ^ z) (1) 
where C(x,y,z) represents the modified CT number of a 3D CT voxel 
located at a point (x,y,z); 
a and b represent weighting coefficients; 

and Co(x,y,z) represents the un-modified CT number, based on a linear 
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attenuation model, of a 3D CT voxel having a location (x,y,z). 
[0073] The interpolation scheme used in one embodiment to improve the 
quality of DRRs is bi-linear interpolation. In this embodiment, bi-linear 
interpolation is performed in step 402, to integrate the CT numbers along 
the CT voxels that are encountered by each cast ray. In one embodiment, 
the bi-linear interpolation is followed by a 1-D polynomial interpolation 
over three voxel slices, for each voxel of interest. The three voxel slices 
include the voxel slice containing the voxel of interest, plus each adjacent 
voxel slice. 

[0074] In some embodiments, fiducial-less tracking relies on skeletal 
reference structures (e.g. vertebral structures) that are usually not easily 
visible, or may even be hidden in the DRRs and in the x-ray projection 
images. Because such skeletal structures have to be registered, both the 
DRR and the x-ray images have to be enhanced to bring out the details of 
the vertebral structures and improve their visibility. In one embodiment, 
therefore, image enhancement is undertaken for both the DRRs and the 
x-ray projection images. In most thoracic and lumbar cases, the skeletal 
structures are not easily visible or even hidden in DRR and X-ray images. 
For these cases therefore, enhancement of the DRR and the x-ray images 
is necessary, in order to make registration at all possible. In cervical 
cases, the skeletal structures of spine are well visible in both the DRR and 
the x-ray images, but the details of the structures are still not clear. 
Accordingly, in cervical cases, the DRR and the x-ray images should be 
enhanced to improve the registration. 

[0075] In the embodiment illustrated in FIG. 5, a top-hat filter is designed 
and used to enhance the x-ray images (step 415 in FIG. 5) and to 
enhance the DRR images (step 403 in FIG. 5). In particular, the skeletal 
structures in the images are enhanced, i.e., brought out, by applying a top 
hat filter operator to the pixels of the x-ray projection images and the DRR 
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images. As known, a top hat filter is a nonlinear operator that finds the 
brightest pixels in two different size neighborhoods, then keeps the 
extreme values. In one embodiment, the top hat filter operates as follows: 
if the brightest value in the smaller neighborhood region is greater that the 
value in the larger neighborhood, by an amount determined by a user- 
entered threshold, then the pixel remains, otherwise it is eliminated. As a 
result of applying a top hat filter to the images, it is possible to locate 
features of interest. 

[0076] In one embodiment, the top-hat filter is designed by using a 
weighted combination of image opening and closing with a certain 
structural element. The top hat filter operator is defined mathematically as 
follows: 

f e = f + wx[f-YB(f)]-bx[<p B (f)-f] ( 2 ) 
= f + wxWTH(f)-bxBTH(f) 
where f e represents the enhanced image, resulting from the application of 
the top hat filter operator to each pixel in the original image; 
f represents the original image; 
w and b represent weighting coefficients, 

y B (f) represents a structural element for the opening of the original image f, 
and 

<p B (f) represents a structural element for the closing of the original image f . 
[0077] In expression (2) above, WTH(f) = f- Yb(0 is called a white top-hat 
filter, whereas BTH(f) = <pB<f) - f is called a black top-hat filter. The 
structural elements yb(t) and <p B (f) are masks that are used to perform the 
basic morphological operation. The sizes of the structural elements vary 
slightly for cervical, thoracic, and lumbar applications. The empirical 
values are determined experimentally. The weighting coefficients w and b 
are determined adaptively by the amplitudes of WTH(f) and BTH(f), 
respectively. Empirically, the values of the weighting coefficients w and b 
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have been found to be about 1 each (w=1 , b=1 ), for a cervical case in 
which less tissue is present. In the lumbar case, in which more tissue is 
present, the values of w and b have been found to be greater than about 2 
each (w > 2, b>2). In the lumbar case, the weighting process brings out 
the skeletal structures to a greater degree, compared with the cervical 
case. 

[0078] In one embodiment, image registration is conducted only in a 
certain region of interest (ROI) defined in the DRR. The ROI contains the 
treatment target (e.g. a tumor or lesion). In one embodiment, image 
entropy is specifically defined, in step 404 in FIG. 5. In this way, the ROI 
can be automatically selected, for optimum registration, minimizing or 
even eliminating user interaction. Because image registration relies on 
the image content or image information, in this embodiment the ROI is 
optimized to contain as much information as possible. 
[0079] The Shannon entropy, known from conventional communication 
theory, is commonly used as a measure of information in signal and 
image processing. It is defined as H = - £ n pi log pi, where H represents 
the average information supplied by a set of n symbols whose 
probabilities are given by pi, p 2 , . . . , p n . When applied to the pixels of 
each image (as enhanced in steps 403 or 415 in FIG. 5), the Shannon 
entropy for each image is defined by: H = - £i p(l) log p(l), where I is the 
image intensity level, and p(l) is the probability of an image intensity value 
I occurring within the ROI. In the original formulation by Shannon, any 
change in the data that tends to equalize the probabilities pi, P2, . . . , p n 
increases the entropy, as observed by Shannon. For a given image, the 
Shannon entropy is conventionally calculated from a image intensity 
histogram, in which the probabilities pi, p 2 , . . . , p n are histogram entries. 
[0080] In one embodiment, the Shannon entropy H is modified, based on 
the fact that the skeletal structures occur in bright areas. In this 
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embodiment, a modified Shannon entropy is used for each image, which 
is defined as follows: 

H= -IilpO)logp(l), (3) 
where again I is the image intensity level, and p(l) is the probability of the 
image intensity value I occurring within the ROI. In step 404 (shown in 
FIG. 5), the modified Shannon entropy is first determined for the 
enhanced DRR image. Once the modified Shannon entropy H is 
calculated, an ROI is then automatically selected by determining the 
region within the DRR for which the entropy H is maximized. Subsequent 
steps in the image registration process (steps 420 - 450 in FIG. 5) take 
place only within the ROI. 

[0081] Restricting the image registration process to within a ROI has 
several advantages. One advantage is that such a restriction can speed 
up the registration process, since the registration needs to be performed 
only for the ROI. For example, the similarity measure needs only be 
computed for the ROI, and block matching need only be performed within 
the ROI. Further, the registration process is more accurate when limited 
to an area within the ROI. The more limited the region in which 
registration is conducted, the less likely it is that structures within the ROI 
would have moved relative to each other between the time of the pre- 
operative CT scans and the time of the medical treatment. 
[0082] Based on the improved and enhanced DRRs (generated in steps 
402 and 403 in FIG. 5), and the enhanced x-ray projection images (step 
415 in FIG. 5), in which the skeletal reference structures have been 
brought out to make fiducial-less tracking possible, a non-rigid 
deformation of the anatomical region is determined in steps 420-450. In 
this patent, a 'rigid body' assumption, i.e. which is often made in image 
registration applications, and which assumes that between image 
acquisitions, the anatomical and pathological structures of interest do not 
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deform or distort, does not have to be made. If a rigid body assumption is 
not needed, there is no longer a need to preserve the 'rigid body 1 
constraints, i.e. to require that the body be rigid and not undergo any local 
variations during the transformation. Based on an abundance of 
observations and analyses on clinical patient data, in the present patent a 
non-rigid deformation is assumed, in lieu of a rigid transformation, to 
obtain an improved description of the real patient movement and body 
deformation. By computing a non-rigid deformation field, patient position / 
orientation can be more reliably monitored and corrected during the initial 
alignment, as well as throughout the entire treatment. 
[0083] A non-rigid image registration allows the inherent local anatomical 
variations that exist between different image acquisitions to be accounted 
for, in contrast to a rigid image registration which does not allow the 
overcoming of such variations. Non-rigid registration defines a 
deformation field that provides a translation or mapping for every pixel in 
the image. In one embodiment, a full motion field, composed of many 
local motion vectors or fields, is computed in order to derive the non-rigid 
deformation field. 

[0084] In order to estimate local motion fields, in one embodiment, a multi- 
level block matching method is used in conjunction with a similarity 
measure based on pattern intensity. This approach allows the local 
motion to be rapidly and accurately estimated in most parts of the ROI. 
Multi-level block matching, which allows for computational efficiency, is 
described in conjunction with a rigid registration algorithm, in a commonly 
owned application, U.S. Serial No. 10/652,786 (the "786 application"), 
incorporated by reference in its entirety. A similarity measure based on 
pattern intensity, used in conjunction with a registration algorithm based 
on rigid transformations, i.e. the "FAST 6D algorithm" developed by 
Accuray, Inc. for use with the Cyberknife radiosurgery system, is 
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described in full in commonly owned applications, U.S. Serial Nos. 
10/652786 (the "786 application"), 10/652717 (the "717 application"), and 
10/652785 (the "785 application"), which are all incorporated by reference 
in their entireties. In the present patent, the pattern intensity based 
similarity measure and the multi-level block matching method are used in 
conjunction with a registration algorithm based on a non-rigid (rather than 
a rigid) transformation. The pattern intensity-based similarity measure, 
originally developed for a rigid image registration algorithm, provides a 
powerful and efficient technique for solving the 2D/3D image registration 
problem, also in a non-rigid framework. 

[0085] In one embodiment, block matching is performed, i.e. a small block 
centered around a point of interest is used in order to locally estimate the 
displacements at each desired point within the ROI. As known, when 
using block matching to register a first image onto a second image, the 
first image is divided into different blocks, typically rectangular boxes of 
equal size. Each point of interest, which may be a mesh node, or may be 
a non-node pixel that is surrounded by mesh nodes, is taken as the center 
of one of the blocks. These blocks are then translated so as to maximize 
a local similarity criterion, which in one embodiment is the pattern intensity 
based similarity measure, described above. 

[0086] In block matching methods, it is generally assumed that each pixel 
in a block has the same motion, and a block matching algorithm is 
typically used to estimate the motion vectors for each block. In a block 
matching algorithm used in one embodiment, a search is conducted for a 
matching block in the second image, in a manner so as to maximize a 
measure of similarity, based on pattern intensity, between the respective 
blocks. The search is for a location of the maximum in the similarity 
measure function, the maximum representing the existence of a matching 
block in the second image. The search may be conducted within a search 
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window that is defined around the point of interest and that contains the 
block. 

[0087] In any block matching algorithm, it is important to optimize the 
search strategy, and to select an appropriate block size. For small blocks, 
the translational rigid model is typically assumed. Even though rigid 
rotations or some other complicated deformations exist, the rigid body 
translation model is valid for estimating the translations for the block 
center point. When rotations or other deformations exist in addition to the 
translations, the accuracy increases with decreasing block size, and 
decreases with increasing block size. With the use of smaller block sizes, 
however, the possibility of mismatching increases. In one embodiment, a 
block size selection strategy is adopted in which it is assumed that larger 
blocks are needed for larger displacements, and that smaller blocks are 
need for smaller displacements. 

[0088] FIG. 6 schematically illustrates local motion estimation for a point of 
interest within a target in a patient, using block matching. In the 
embodiment illustrated in FIG. 6, the target is located in the cervical 
region of the spine, although it is again emphasized that the non-rigid 
2D/3D image registration technique can be used in applications other than 
structural spine tracking. The left and the right pictures are the DRR and 
the x-ray images, respectively. A small block 203A is defined around a 
point of interest 205 in the DRR. Also, a search window 207 that 
encompasses the block 203 is defined in the DRR. The matching block in 
the x-ray image is indicated in FIG. 6 with reference numeral 203B. In the 
embodiment illustrated in FIG. 6, the size of the search window 207 is 
48mm x 48mm, and the block size is 15 x 15 mm. It can be seen, simply 
by visual inspection, that the point of interest 205 is well located in the X- 
ray image. 

[0089] FIG. 7 schematically illustrates a multi-resolution image 
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representation, when implementing multi-level block matching, using 
multiple candidates. Multi-level block matching is a fast search method 
that uses the displacement estimates made at a lower level as the initial 
results for subsequent search phases. The basic idea in multi-level block 
matching is to match the images at each of a plurality of resolution levels, 
successively, starting from the lowest resolution level and moving up to 
the highest resolution level. The full-size image, having the highest 
resolution level, is shown at the bottom in FIG. 7, as level 1. The upper 
images (level 2 and level 3) have successively lower spatial resolutions, 
the image having the lowest resolution being shown as level 3. The lower 
resolution images are obtained by lower pass filtering, and sub-sampling 
the full-size images. 

[0090] In Figure 7, assuming that the full image block size is Wx H in 

W H W H 
Level 1 , the block sizes are — x— and — x— in Level 2 and Level 3, 

2 2 4 4 

respectively, as indicated in the figure. In the lowest resolution level 

(Level 3), a large search range is used to enable estimation of large 

displacements. A very small search range (-2, +2) is used in the rest of 

the resolution levels. 

[0091] The results at the lower resolution level serve to determine rough 
estimates of the displacements. The output at the lower level is then 
passed onto the subsequent higher resolution level. The estimated 
motion vector (in most cases, a translation vector) for the block is 
successively refined, using the higher resolution images. In the final 
matching results, the accuracy of the estimated translations depends on 
the spatial resolution of the highest resolution images (shown as level 1 in 
FIG. 7). 

[0092] There is some risk in multi-level matching. It is possible that the 
estimate at lower levels may fall in a local maximum, and far away from 
the global maximum that is being sought. In this case, further matchings 
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at subsequent higher resolution levels may not converge to its global 
maximum. To overcome this risk, multiple candidates are used for the 
estimates, in one embodiment. Many candidates that have shown optimal 
matching results are passed on from the lower levels to the higher 
resolution levels. The more candidates that are used, the more reliable 
are the estimates. In one embodiment, the best candidates are ranked by 
the similarity measure function values. 

[0093] In one embodiment, a similarity measure based on pattern intensity 
is used, in conjunction with multi-level block matching. As mentioned 
earlier, this similarity measure is a key element contributing to the success 
of the "FAST 6D algorithm," described in the commonly owned 786 
application, 717 application, and 785 application. In one embodiment, 
the similarity measure is determined by forming a difference image 
between the "live" (or near real time) x-ray projection images and the DRR 
images, and applying upon each pixel of the difference image a pattern 
intensity function. Specifically, the difference image l<nn(i,j) is formed by 
subtracting a corresponding pixel value of the pre-operative DRR image 
from each pixel value of the intraoperative x-ray projection image, within 
the ROI: 

I*{U)=Iu*(U)-I!*r{U) ( 4 ) 

[0094] In equation (4), l(i,j) represents the image intensity value of a pixel 
located at the Mh row and j-th column of each pixel array for the 
respective image. Specifically, lafi.J) represents an array of pixel values 
for a difference image formed by subtracting the corresponding pixel 
values of the second image from each pixel value of the first image. 
fiive(ij) represents the (ij)-th pixel value of the first image of the object. 
IdrrO j) represents the (i,j)-th pixel value of the second image of the object. 
The similarity measure operates on this difference image, and is 
expressed as the summation of asymptotic functions of the gradients of 
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the difference image over the pixels within a neighborhood R: 



a 2 



a 2 + (I m dJ)-I diff (i + kj + /)) 2 (5) 

[0095] In equation (5) above, the constant o is a weighting coefficient for 
the pattern intensity function. The sensitivity of the solution to the 
variation of x-ray image can be minimized by careful selection of this 
constant. The larger the weighting coefficient, the more stable the results. 
However, the choice of a entails a tradeoff between stability and accuracy. 
When the value of a is too large, some small details in the images cannot 
be reflected in the similarity measure. Based on the experiments, the 
empirical value for a is in the range from about 4 to about 16, in one 
embodiment. 

[0096] FIG. 8 schematically illustrates a neighborhood R for calculating a 
similarity measure based on pattern intensity. As seen from FIG. 8, the 
neighborhood R in the illustrated embodiment is defined so that the 
gradients of the difference image can be considered in at least four 
directions (horizontal, vertical, 45° diagonal and -45° diagonal). When the 
neighborhood R is defined in this manner, equation (5) for the similarity 
measure becomes: 

y <* h y Z . 

if* % +((i„(ij)-i„(ij-i)f ^ 2 + ((/ # 0;i)-V0'-l;)) 2 

(6) 

y v 1 x y Zl 

if a 2 + {v* ff ft J) - h iff c - u - 1)) 2 u^+^y-^p-u+i)) 1 ' 



[0097] Equations (5) and (6) for pattern intensity have several advantages. 
First, the difference image filters out the low frequency part that 
predominantly consists of the soft tissues, and keeps the high frequency 
part that predominantly consists of the skeletal structures. This feature 
makes the algorithm robust to some brightness intensity difference 
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between live and DRR images. Second, because of the asymptotic 
function, the measure is less affected by the pixels whose intensity value 
slightly deviates from its neighboring pixels. These types of pixels are 
thought to contain random noise. Third, because the asymptotic function 
quickly approaches to zero when the variable increases, large intensity 
differences such as image artifacts have the same effects on the similarity 
measure regardless of their magnitude. Due to this feature, the pattern 
intensity is less sensitive to image artifacts. 

[0098] The estimation of local motion fields using block matching together 
with hierarchical mesh motion estimation, as well as the reconstruction of 
the full motion field from the plurality of locally estimated motion fields, are 
performed in steps 420 - 450 of the flowchart shown in FIG. 5. Fast 
generation of the full motion field is achieved by using hierarchical mesh 
tracking, and using SIMD (single instruction multiple data) technology to 
perform image computation in parallel. 

[0099] In one embodiment, a global translation of the entire image 
(measured as a translation of the image center of the image) is first 
estimated, then used as the initial estimates for all further local motion 
estimation. In other words, a rough estimate is made of the center 
displacement for the entire image, and is used as the starting estimate for 
all local displacements. Referring back to FIG. 5, the first step (indicated 
with reference numeral 420 in FIG. 5) in generating a full motion field for a 
target, between the pre-operative scan and the intraoperative treatment, 
is the step of estimating a global translation for the entire image, or 
equivalently, estimating the center displacement of the image. 
[00100] FIG. 9 illustrates the estimation of global motion (in this 

case, translation only), between the image center of a DRR and the image 
center of a corresponding x-ray image. In the illustrated embodiment, the 
image center is used as the block center. The step of global translation 
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estimation is very important, because any failure during this step will affect 
the rest of the local motion estimation process. To prevent any possibility 
of mismatching, a very large image block is used in the illustrated 
embodiment. The maximum tracking range can be calculated as the 
difference between the block size and the entire image size. For example, 
if the matching size is 80x80 mm, the maximum tracked translation is 60 
mm. In the embodiment illustrated in FIG. 9, a block having a size of 160x 
160 pixels (64 mm x 64 mm) is used. The search window in the illustrated 
embodiment is the entire image. The maximum track range for the 
illustrated embodiment is (-50 mm, +50 mm). 

[00101] After global motion estimation, the next step 430 (see FIG. 5) 
is mesh motion estimation. In this step, a hierarchical 2D mesh structure 
is designed in order to estimate local motion in multiple levels. As known, 
a 2D mesh (or a 2D mesh grid) refers to a tesselation of a 2D region into 
polygonal patches or elements, whose vertices are called mesh nodes. 
Unlike block matching algorithms, which generally assume only 
translational motion, 2D mesh models allow for spatial transformations to 
model rotations, scalings, and deformations of the object that was imaged, 
in addition to translations of the object Compared to block matching 
algorithms, therefore, mesh-based methods may produce a more 
accurate representation of the motion field, for example may generate 
continuously varying motion fields. 

[00102] FIG. 10 schematically illustrates a mesh grid 300 
established for a DRR of a target region, and a corresponding mesh grid 
302 established for an x-ray image of the target region, in an embodiment 
in which the target is located within the cervical region of the spine. With 
a 2D mesh, motion compensation within each mesh element or patch may 
be accomplished by deriving a spatial transformation between the images, 
where the transformation parameters are computed from the nodal motion 
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vectors, i.e. from the motion vectors that are estimated for the mesh 
nodes that are located at the vertices of the mesh. In other words, mesh- 
based motion estimation consists of finding a spatial transformation that 
best maps one set of mesh elements in a first image acquisition onto 
another set of mesh elements in a second image acquisition. 
[001 03] In particular, mesh motion estimation consists of finding the 
vertices of corresponding mesh elements in the other image, i.e. finding 
the corresponding mesh nodes in the other image, such that errors are 
minimized in the overall motion field. Typically, a number of mesh nodes 
are selected in one image, and the corresponding mesh nodes in the 
other image are estimated. For any pixel located within a mesh element 
(as opposed to being located on the vertices of the mesh elements), the 
mapping between different image acquisitions is performed through 
interpolation. The local motion vectors for such pixels are estimated by 
interpolating from the nodal motion vectors that were estimated for the 
mesh nodes that surround the pixel. 

[00104] In one embodiment, hierarchical mesh motion estimation 
may be performed. By hierarchical mesh motion estimation, it is meant 
that nodal motion is estimated for the mesh nodes that define the mesh 
structure, for each of a plurality of mesh resolution levels. Motion 
estimation performed with a coarse mesh provides the initialization for the 
subsequent (finer) resolution levels of the mesh. To estimate the motion 
of each mesh node, multi-level block matching may be performed. 
[00105] FIG. 1 1 illustrates a mesh hierarchy, during mesh motion 
estimation. As seen from FIG. 1 1 , the mesh hierarchy starts from a 
relatively coarse mesh, 320, and progresses onto finer meshes, illustrated 
as 322 and 324. Using the global translations (estimated in step 420 of 
FIG. 5 as the initial estimates), nodal motion for the mesh nodes located 
at the vertices of the most coarse mesh is first calculated. These 



34 



2006/039009 



PCT/US2005/030081 



estimates are then passed onto the subsequent, finer mesh. At each level, 
nodal motion is updated, using a smaller search range. Finally, the 
motion vectors for the mesh nodes at the final one of the mesh resolution 
levels (characterized by the finest mesh resolution level) are refined. For 
all the nodes, multi-level block matching with multiple candidates is used, 
together with the pattern-intensity based similarity measure, given in 
equations (5) and (6). 

[00106] FIG. 12 schematically illustrates the passing on of node 
estimation, from a coarse mesh resolution level onto a finer mesh 
resolution level. At each mesh resolution level after the first level, the 
mesh nodes include both 1 ) mesh nodes generated at a previous mesh 
resolution level; and 2) mesh nodes that are newly added at the current 
mesh resolution level. In the illustrated embodiment, the initial estimates 
for nodal motion vectors, for the newly added nodes at the current mesh, 
are obtained by linear interpolation of the existing nodal motion vectors, at 
the previous mesh resolution level. During this process, any unreliable 
mesh node needs to be detected, so that only reliable nodes are passed 
onto the subsequent mesh level. 

[00107] FIG. 12 illustrates how such a detection can be performed, 
using a mesh node referred to in FIG. 12 as 'node 5/ In the illustrated 
embodiment, the difference between the motion vector (in this case, 
translation vector) of node 5, and the median motions (translations) 
computed from its 9 surrounding nodes (nodes 1-4, 6-9 in FIG. 12) is 
taken. As seen from FIG. 12, the translation of node 2 is the average of 
the translations of node 1 and node 3; the translation of node 4 is the 
average of the translations of node 1 and node 7; the translation of node 6 
is the average of the translations of node 3 and node 9; and the 
translation of node 8 is the average of the translations of node 7 and node 
9. The translation of node 5 is the average of the translations of nodes 1, 
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3, 7, and 9. If the difference between the translation of node 5 and the 
median translations computed from its 9 neighboring nodes is less than a 
predefined threshold, the node 5 is considered as a reliable node. 
Otherwise, it is considered as an unreliable node, and its translations are 
replaced with the median values and passed to the subsequent mesh. 
[001 08] For most mesh nodes, the estimates of motion are reliable 
and accurate. For a few nodes where mismatching may occur and the 
estimation may not be reliable, the displacements need to be 
reconstructed by the surrounding node displacements. Accordingly, the 
next step in the registration algorithm flow chart in FIG. 5 is step 440 of 
motion field reconstruction, during which the motion field is reconstructed 
from surrounding nodes, for those nodes in which mismatching occurs. 
The inaccurate nodal motion vectors can be detected by using 3x3 
median filtering. 

[001 09] Local motion estimation relies on the local image content. In 
some smooth local regions, mismatching may occur. During mesh motion 
estimation, the estimation in most nodes is pretty accurate. For a few 
nodes where mismatching occurs, the motions should be reconstructed 
from their surrounding nodes. What is known a priori is matter coherence 
of bone and tissue, and accordingly, the smoothness of local motion. In 
other words, the estimated local motion vectors are thought to be smooth 
and continuous, because of matter coherence. By imposing this 
physically-based smoothness constraint, a cost function is formulated to 
reconstruct the motion field. 

[001 1 0] In one embodiment, the cost function is expressed 
mathematically as follows: 



In equation (7) above, E(d) represents the cost function, d represents a 
desired local estimate for a nodal motion vector at coordinates (x,y), u 
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represents a locally estimated nodal motion vector at coordinates (x,y), 
and ft represents a reliability constant that ranges from 0 to 1 , where (1 = 0 
indicates an unreliable estimation, and fi=1 indicates a reliable estimation. 
[001 11] By performing a finite difference of the derivatives over the 
mesh grids, a discretized form for the cost function in equation (7) is 
expressed as: 

(8) 

where u i } represents the locally estimated translations, d u is the local 
motion desired, fi tJ =1 if the estimation is reliable and fi {J = 0 if the 
estimation is unreliable. 

The first term on the right side of equation (8) reflects the fidelity to the 
observed data in the reconstruction. The second term imposes the 
smoothness constraints on the motion field in two spatial directions. 
[001 1 2] The minimization of the cost function given by equation (8) 
results in a system of simultaneous linear equations 



dd tJ 



0) 

In one embodiment, the iterative algorithm of successive-over relaxation 
(SOR), which is fast and convergent, is used to solve the equations: 

(10) 

[001 1 3] Once all the nodal motion vectors have been estimated at all 
the mesh nodes, the translations for any point (or pixel) inside the ROI 
can be computed by interpolation. FIG. 1 3 schematically illustrates the 
determination of a motion vector for a point of interest, by interpolation 
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from surrounding nodes. In the illustrated embodiment, quadratic 
interpolation is performed, using the 9 nearest nodes, and 9 shape 
functions are used. 

[001 14] Assuming the motion vector (dx(i) 9 dy(i)) for nine nodes, the 
motion vector (dx 9 dy) at the point of interest is computed using the 
following expressions: 

(11) 

where N(i) is the shape function for the node (/), and where N(i) for I = 1 , 

2, ... 9 are given as follows: 

tf(l) = (l-£Xl-7)/4-to+tf5)/2, 

N(2) = (l-ill-n)/4-(N s +N 6 )/2, 

N(3) = (l + 4)(l+v)/4-(N 6+ N 1 )/2, 

N(4) = (l-tXl+Tj)/4-{N 1 +N s )/2, 

N(5) = (l-Z 2 Xl-rj)/2, 

tf(6) = (l-#Xl-V>2, 

N(7) = (l-4 2 \l + Tj)/2, 

AT(8) = (l-#Xl-7 2 )/2, 

iV(9) = (l-^ 2 Xl-^ 2 ) 
(12) 

[001 1 5] Using steps 420, 430, and 440, described above, the local 
motion vectors can be estimated for a plurality of points of interest within 
the ROI. The full motion field is obtained as a composite or superposition 
of all of the local motion vectors that are estimated for the many points of 
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interest that have been selected for motion estimation. FIG. 14 
schematically illustrates, in vectorial form, a full motion field (reconstructed 
from many estimated local motion vectors), in an embodiment in which the 
target is located within the cervical region of the spine. 
[001 1 6] The final step in the image registration process is target 
localization, namely deriving the target translations and rotations from the 
full motion field that has been determined. In one embodiment, non-rigid 
image registration seeks to determine a projection mapping or 
transformation between different coordinate systems in respective image 
acquisitions such that points in each space which correspond to the same 
anatomical point are mapped to each other. In one embodiment, the 
transformation is represented by a set of non-rigid transformation 
parameters (dx T , dy T , dz T , r, p, w), where (dx T) dy T , dz T ) represent the 
translations of the target, and (r, p, w) represent rotations of the target. 
[001 17] In one embodiment, two orthogonal x-ray projections are 
used to solve for these six parameters. In this embodiment, the 
registration in each projection is performed individually, and the results of 
the registration for each projection are subsequently combined, to obtain 
the six 3D transformation parameters. In other embodiments, however, 
different projections or combinations thereof may be used to solve for the 
transformation parameters. 

[001 18] FIG. 15 illustrates the geometric relations between a three- 
dimensional treatment target, and two orthogonal 2D x-ray projections 
(labeled A and B in FIG. 15), in a non-rigid image registration method in 
accordance with one embodiment. A pair of cameras (or image receivers) 
A and B receive their x-ray projections from respective x-ray sources (not 
shown). In the coordinate system of the 3D scan, the x-axis is directed 
inward into the paper, and is not indicated in FIG. 15. As explained above, 
the change in position of the treatment target is represented by three 
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translations and three global rigid rotations (dx, dy, dz, r, p, w). 
[001 19] In FIG. 15, the orthogonal 2D projections A and B are 
viewed from the directions o A s A and o B s B , respectively. For each of the 
projections A and B, FIG. 15 illustrates respective 2D planar coordinate 
systems that are fixed with respect to the image plane that characterizes 
each projection. The image planes A and B for the projections A and B 
are thus defined by mutually orthogonal axes within the respective 
coordinate systems. These axes are shown in FIG. 15 as (xa,Ya) for 
projection A, and (x s , ya) for projection B. The direction of the axis x A in 
the 2D coordinate system for projection A, and the direction of the x-axis 
in the 3D scan coordinate system, are opposite with respect to each other. 
The direction of axis x B in the coordinate system for projection B, and the 
direction of the axis x in the 3D scan coordinate system, are the same. 
[00120] For projection A, the 2D motion field (dx A dy A ) is estimated 
by registering the x-ray image that is projected onto the image plane A, 
with the corresponding reference DRR image. For projection B, the 2D 
motion field (dx s , dy a ) is estimated by registering the x-ray image that is 
projected onto the image plane B, with the corresponding reference DRR 
image. Given the 2D motion fields (dx A dy*) for projection A, and (dx s , 
dy B ) for projection B, the 3-D target translation (dx T ,dy T ,dz T ), as well as 
the global rigid rotations (r, p, w), can be obtained for both projections A 
and B, by a straightforward mathematical operation. 
[001 21 ] Referring back to FIG. 5, the 3-D target translation 
(dxT,dy T ,dz T ) can easily be obtained in step 455 (shown in FIG. 5), given . 
the 2D local motion fields (dx A dy A ) for projection A, and (dx s , dy 8 ) for 
projection B, using the following expressions: 
dx T =(dx TA +dx TB )/2, 

dz T ={dy TA +dy TB )l4l 
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(13) 

[00122] The global rigid rotations (r, p, w) can be calculated from the 
motion fields (dx A , dy A ) in projection A and (dx B , dy B ) in projection B. 
Using the target as the rotation center, global rigid rotations are useful for 
position and rotation correction and compensation during initial patient 
alignment and treatment. Because the target translation is already 
calculated, the calculation of the global translation is not needed. To get 
the three rotations in 3D patient coordinates, three 2D in-plane rotations 
are first computed, including the in-plane rotations 0 A and 0 B in 
projections A and B, respectively, and the in-plane rotation 6 X in a plane 
perpendicular to the inferior-superior axis. Approximately, the global 
rotations can be expressed as: 



w = (0 b +0 a )/J2, 
(14) 

[00123] Estimation of 0 A and 0 B is directly based the 2D motion 
fields in projections A and B, respectively. To estimate 9 X , a plane is first 
defined, which passes the target point and is perpendicular to the axis x in 
the 3D patient coordinate system. Then the motion field is calculated from 
the two motion fields {x A9 y A ) and (x 9i y B ) in projections A and B, 
respectively. 

[001 24] Assuming (dx, dy) is the motion field in the corresponding 
coordinate (x, y) and 0 is the global rotation, when the rotation is small (< 
10°), the following transformation equation is valid: 




(15) 
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[00125] Given (dx,dy) and (x,y) in many points, 6 can be easily 
calculated using least square minimization method 

X(x(i)dy(i)-y(0dx(i)) 

[00126] 0=< ^ 

I^OWO + ^OMO) 

i 

(16) 

Using equations (14) and (16) above, the average rigid transformation 
parameters can be obtained, in step 160 illustrated in FIG. 5. 
[00127] Using the results of non-rigid registration, obtained as 
described above, dynamic tracking of the targets can be performed during 
treatment delivery, by using the 4D mathematical model, and by 
monitoring the breathing (or other motion) cycle during delivery of 
radiation. FIG. 16 schematically illustrates dynamic tracking of a moving 
target within a breathing patient during delivery of treatment radiation, 
using surface markers 340 (e.g. infrared LEDs) to monitor the breathing 
motion of the patient as a function of time. Although surface markers are 
described with respect to the embodiment illustrated in FIG. 16, in other 
embodiments other techniques may be used to track the surface of the 
patient. These techniques include, but are not limited to: video systems; 
laser scanners; ultrasound (or other acoustic) scanners; and when 
tracking heartbeat motion, electrocardiograms. Any method and device 
known in the art to track patient surfaces may be used. 
[001 28] Dynamic tracking of the target 21 0 (i.e. the tumor/lesion 21 0) 
during treatment delivery can achieved by combining the 4D mathematical 
model obtained during 4D treatment planning, with the registration 
information provided by fiducial-less tracking, and monitoring of the 
patient breathing motion using surface markers. As explained above, the 
4D mathematical model obtained during treatment planning relate the 
locations of the skeletal structures 230 and 232 to the locations of the 
target 210. 
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[001 29] A number of approaches to dynamic tracking are possible, 
when using fiducial-less tracking in association with 4D planning and 
dynamic tracking of surface markers, as schematically shown in FIG. 16. 
In a first approach, the surface markers 340 are first used to determine 
the instant in the breathing cycle. The rigid reference structures 232 can 
then be located, using non-rigid fiducial-less tracking. The location of the 
tumor/ lesion 210 is then tracked, by drawing vectors from the reference 
locations determined using non-rigid image registration. The knowledge 
of the instant in the breathing cycle, obtained using the surface markers 
340, and the models obtained from 4D treatment planning that relate the 
locations of the target 210 to the locations of the rigid skeletal structures 
232, are used to draw vectors from the rigid reference structures 232 
whose locations have been obtained during non-rigid image registration. 
[001 30] In a second approach, the tumor/lesion 21 0 is located, using 
the location of the reference structures 232 and 230 obtained from 
fiducial-less tracking, and using a model that relates the location of the 
reference structures 232 and 230 to the lesion 210, obtained from 4D 
treatment planning. Next, a mathematical model is built that relates the 
motion of the surface markers 340 to the motion of the lesion 21 0. In 
image processing, it is known in the art to relate the motion of a first type 
of object with the motion of a second type of object, and to describe such 
a relation in terms of a mathematical model. Algorithms or software that 
are known and that may be commercially available can be used to built 
the mathematical model that relates the motion of the surface markers 
340 to the motion of the lesion 210. 

[001 31] In a third approach, the locations of the reference structures 
that are determined from the 4D model obtained during 4D treatment 
planning are used to locate the lesion by interpolation. A mathematical 
model is then built that relates the motion of the surface markers 340 to 
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the motion of the lesion 210. This third approach involves the least 
dependence on the 4-D planning model obtained from treatment planning. 
[001 32] In a final approach, the tumor or lesion 21 0 is directly 
tracked, using the 2D / 3D registration techniques described in the above 
paragraphs. In this approach, the model relating the motion of surface 
markers to the location of the target 210 can be built directly, using just 
the results of the 2D/3D registration. 

[00133] Once the targets have been located, using one of the 
approaches described above, radiation beam delivery can be 
implemented. The real time locations of the targets within the moving 
anatomical region, which are determined as described in the previous 
paragraphs, provide guidance for beam delivery. During treatment 
planning, the beam trajectories are initially defined with respect to a 
nominal patient co-ordinate system, perhaps chosen to orient with respect 
to one of the several CT studies acquired to cover the motion cycle. This 
epoch in the motion cycle is determined by analyzing the motion of the 
surface markers, and each radiation beam is to be turned on from this 
epoch. 

[001 34] FIG. 1 7 provides a schematic block diagram of an apparatus 
500 for dynamically tracking targets 210 within an anatomical region that 
is undergoing periodic motion having a cycle P, and for delivering 
therapeutic radiation to the moving targets. The targets may include 
tumors or lesions. The anatomical region includes reference structures in 
addition to the targets 210. The reference structures may include rigid 
reference structures 232 which do not move during the periodic motion, 
anc j reference structures 230 which themselves move during the periodic 
motion of the anatomical region. 

[00135] In overview, the apparatus 500 includes: a target locater 
510 that determines in real time the locations of the target(s) 210 relative 
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to the reference structures 230 or 232 within the periodically moving 
anatomical region; a 4D treatment planning generator 520 that generates 
a 4D treatment plan as a function of the relative position in time within P 
for the moving anatomical region; and a treatment beam generator 524 
that delivers therapeutic radiation to the targets in real time in accordance 
with the treatment plan. The treatment plan prescribes a desired radiation 
dose distribution to be delivered in real time to the targets, while 
accounting for a deformation of the moving anatomical region during the 
periodic motion. The treatment planning generator 520 may be connected 
to a treatment delivery controller (not shown) which controls the delivery 
of radiation, in accordance with the treatment plan generated by the 
treatment planning generator 520. 

[00136] Also included may be a robot system (not shown), which 
typically has a fixed base and an articulated arm assembly at the distal 
end of which the treatment beam generator (e.g. x-ray source such as a 
linac) may be mounted. The robot system may move (and orient), in 
response to the directions of the delivery controller, the treatment beam 
generator (i.e. x-ray linac). The robot system and treatment beam 
generator are described in detail, for example in commonly owned U.S. 
Patent No. 5,207,223, and U.S. Patent Application Serial No. 10/814,451 
characterized by attorney docket No. ACCL-120, both incorporated by 
reference herein in their entireties. 

[001 37] In one embodiment, the target locater 51 0 includes a 3D 
scanner 520; a DRR generator 530; an x-ray imaging system 540; an 
image registration subsystem 550; one or more motion sensors 560; a 4D 
model generator 570; and a target location computer 580. The 3D 
scanner 520 generates a plurality of 3D images lj Q = 1 , . . . , p) of the 
anatomical region, at each of a succession of time points tj G s 1, . . . . p) 
within the cycle P. These 3D images may include, but are not limited to: 
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a 3D CT scan; a 3D MRI scan; a 3D PET (positron emission tomography) 
scan; and a 3D ultrasound scan. The 3D scanner 520 can therefore be 
one of: a 3D CT scanner; a 3D PET scanner; a 3D MRI scanner; and a 
3D ultrasound scanner. 

[001 38] The time points tj (j = 1 p) are preferably chosen to 

substantially encompass a full range of the periodic motion of the 
anatomical region. For example, the time points may include: a first time 
point corresponding to a peak of the cycle P, a second time point 
corresponding to a trough of the cycle P, and a third time point disposed 
at an intermediate location between the peak and the trough of the cycle 
P. The motion sensors 560, which may be surface markers; for example, 
monitor the periodic motion of the anatomical region, and measure the 
cycle P. In this way, the motion sensors 560 generate time position data 
representative of the relative position within P of one or more desired time 
points. 

[001 39] The DRR generator 530 reconstructs DRRs from the 3D 
scan data, at each time point tj, by casting hypothetical rays through the 
volumetric 3D scan data from a known beam projection geometry, and 
integrating the 3D scan data along each ray. The x-ray imaging system 
540 generates near real time 2D x-ray projection images of the targets 
210 and the reference structures 230 and 232 within the moving 
anatomical region, by detecting x-ray imaging beams after the beams 
have traversed at least a portion of the anatomical region. These x-ray 
imaging beams are generated from the same beam projection geometry 
as used to generate the DRRs. 

[00140] The image registration subsystem 550 registers the near 
real time x-ray projection images of the reference structures and/or the 
targets, with the DRRs of the reference structures and/or the targets, 
thereby determining the locations of the reference structures and/or the 

46 



targets. In one embodiment, the image registration subsystem 550 
includes: 1) an ROI selector 620 configured to select an ROI (region of 
interest) within the DRR, the ROI containing the treatment target and 
preferably at least one reference structure; 2) an image enhancer 630 
configured to enhance the DRRs and the x-ray images by applying a filter 
operator to the DRR and to the x-ray image; 

3) a similarity measure calculator 640 configured to determine a measure 
of similarity between the DRR and the x-ray image; 4) a motion field 
generator 650 configured to generate a 3D full motion field by estimating, 
for each of a plurality of resolution levels, one or more 2D local motion 
fields within the ROI, using the similarity measure; and 
5) a parameter determiner 660 configured to determine a set of non-rigid 
transformation parameters that represent the difference in the position 
and orientation of the treatment target as shown in the x-ray image, as 
compared to the position and orientation of the treatment target as shown 
in thb DRR, from the 3D full motion field. 

[00141] The 4D model generator 570 generates a 4D model that 
describes a motion of the targets 210 relative to the reference structures 
232 within the moving anatomical region, as a function of the relative 
position in time within the cycle P. The target location computer 580 
computes the locations of the targets at the one or more desired time 
points. The target location computer 580 uses the 4D model constructed 
by the 4D model generator 570, to correlate the locations of the targets 
with the known locations of the reference structures, as determined by the 
image registration subsystem 550, and uses the time position data 
obtained by the motion sensors 560 to determine the relative position 
within P of each desired time point. 

[00142] In one embodiment, the 4D model generator 570 includes a 
deformation model constructor 575 configured to construct a 
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mathematical model that describes the deformation and motion of the 
anatomical region, as a function of the relative position in time within the 
cycle P. In this embodiment, the 4D model generator 570 derives the 4D 
model from the mathematical model constructed by the deformation 
model constructor 575. The deformation model constructor 575 extracts, 
from the plurality of images lj generated by the 3D scanner 520, 
deformation data that contain information relating to the deformation and 
motion of the anatomical region. 

[00143] In one embodiment, the deformation model constructor 575 
extracts the deformation data from the plurality of images by registering 
each image that is taken at a time point tj within P, onto a consecutive 
image that is taken at a consecutive time point t j+1 within P. The 
information contained in the deformation data comprises information 
relating to the change in the position and orientation of the targets relative 
to the reference structures. The deformation model constructor 575 uses 
the deformation data, together with the time position data from the motion 
sensors 560 to mathematically correlate, for each time point tj, the relative 
position within P of the time point tj with the deformation and motion of the 
targets at the time point tj. 

[00144] Using the 4D mathematical model generated by the 4D 
model generator 570, the results of non-rigid image registration as 
performed by the image registration subsystem 550, and the relative 
position in time within the breathing cycle as determined by the motion 
sensors 560, the target location computer 580 computes the locations of 
the tumors/lesions. 

[00145] Using the 4D mathematical model generated by the 4D 
model generator 570, the 4D treatment plan generator 520 generates a 
desired radiation dose distribution that results from continuous radiation 
beam delivery through the non-rigidly moving anatomical region. Finally, 
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the treatment beam generator 524 generates and treatment beams in 
accordance with the desired radiation dose distribution, and delivers them 
in real time to the targets. 

[00146] In sum, a number of techniques have been described for 
dynamically tracking tumors/lesions in the anatomy that move, for 
example due to periodic motion such as respiration. These techniques 
combine 4D treatment planning, fiducial-less tracking of skeletal 
structures or targets, and dynamic tracking of surface markers with 
pertinent mathematical models, to achieve dynamic tracking of the 
tumors/lesions of interest. 

[00147] While the invention has been particularly shown and 
described with reference to specific embodiments, it should be understood 
by those skilled in the art that many modifications and variations in form 
and detail may be made in the techniques and structures described and 
illustrated herein, without departing from the spirit and scope of the 
invention. Accordingly, the techniques and structures described and 
illustrated herein should be understood to be illustrative only and not 
limiting upon the scope of the present invention. The scope of the present 
invention is defined by the claims, which includes known equivalents and 
unforeseeable equivalents at the time of filing of this application. 
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What is claimed is: 

1 . A method of dynamically tracking one or more targets within an 
anatomical region in order to deliver therapeutic radiation to the targets 
during a motion of the anatomical region, the anatomical region including 
one or more reference structures, the method comprising: 

determining in real time the locations of the targets relative to the 
reference structures within the moving anatomical region; 
and 

generating a radiation dose distribution that prescribes a desired 
amount of the therapeutic radiation to be delivered in real time to the 
targets within the moving anatomical region, the dose distribution 
accounting for a deformation of the moving anatomical region during the 
motion. 

2. A method in accordance with claim 1 , wherein the motion 
comprises a periodic motion characterized by a cycle P. 

3. A method in accordance with claim 2, 

wherein the act of determining in real time the locations of the 
targets relative to the reference structures within the moving anatomical 
region comprises: 

constructing a 4D mathematical model that correlates a 3D motion 
of the targets relative to a 3D motion of the reference structures, as a 
function of the relative position in time within the cycle P, and accounts for 
a non-rigid deformation of the anatomical region during the cycle P; 

dynamically tracking the periodic motion of the anatomical region 
so that the relative position within the cycle P of one or more time points 
can be determined in real time; 
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registering one or more near real time images of the reference 
structures with one or more pre-operative images of the reference 
structures, to derive the locations of the reference structures in near real 
time; 
and 

using the near real time locations of the reference structures, the 
4D mathematical model, and the relative position within P of the time 
points, to compute the locations of the targets in real time. 

4. A method in accordance with claim 3, wherein the act of 
constructing the 4D mathematical model that describes the deformation 
and motion of the anatomical region comprises: 

generating a plurality of images lj (j = 1 p) of the anatomical 

region, each image being taken at one of a succession of time points tj (j = 
1, . . . , p) within the cycle P; 

measuring the cycle P, and the relative position within P of each 
time point tj; and 

constructing a mathematical model from which an image lj can be 
derived, for any time point tj, and from which the position of the target 
relative to the reference structure can be measured at any time tj in the 
corresponding image lj. 

5. A method in accordance with claim 4, wherein the act of measuring 
the cycle P and the relative position within P of each time point tj 
comprises attaching an external device to at least a portion of the 
anatomical region in order to sense the change in location and orientation 
of the portion of the anatomical region as a function of the instant in time 
within the cycle P. 
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6. A method in accordance with claim 5, wherein the motion of the 
anatomical region comprises a respiration, wherein the cycle P comprises 
a respiratory cycle, and wherein the external device comprises a 
breathing sensor. 

7. A method in accordance with claim 5, wherein the motion of the 
anatomical region comprises a heartbeat, wherein the cycle P comprises 
a heartbeat cycle, and wherein the external device comprises a heartbeat 
monitor. 

8. A method in accordance with claim 4, wherein the plurality of 
images taken at the time points tj comprise at least one of: a 3D CT scan; 
a 3D MRI (magnetic resonance imaging) scan; a 3D PET (positron 
emission tomography) scan; and a 3D ultrasound scan. 

9. A method in accordance with claim 1 , further comprising the step of 
delivering therapeutic radiation to the targets in real time in accordance 
with the radiation dose distribution, while the anatomical region undergoes 
the motion. 

10. A method in accordance with claim 4, wherein the time points 1} fl = 

1 p) substantially encompass a full range of the periodic motion of 

the anatomical region. 

11. A method in accordance with claim 10, wherein the time points 
comprise at least: a first time point corresponding to a peak of the cycle P 
and a second time point corresponding to a trough of the cycle P. 
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12. A method in accordance with claim 4, further comprising the acts of: 

obtaining, from the plurality of images l jf deformation data that 
contain information relating to the deformation and motion of the 
anatomical region, at each time point tj within the cycle P, and 

for each time point tj, using the deformation data and the measured 
cycle P to mathematically correlate the relative position within P of the 
time point ^ with the deformation and motion of the targets at the time 
point tj; and 

wherein the act of obtaining the deformation data from the plurality 
of images comprises: 

morphing each image that is taken at a time point tj within P, onto a 
consecutive image that is taken at a consecutive time point t j+1 within P. 

1 3. A method in accordance with claim 4, wherein the information 
contained in the deformation data comprises information relating to the 
change in the position and orientation of the targets relative to the 
reference structure. 

14. A method in accordance with claim 1, wherein the targets comprise 
at least one of a tumor and a lesion, and wherein the reference structures 
comprise skeletal structures. 

15. A method in accordance with claim 1, wherein the desired radiation 
dose distribution is indicative of a therapeutically effective amount of 
radiation to be delivered to the one or more targets, at each of the desired 
time points. 

16. A method in accordance with claim 4, 
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wherein the 4D mathematical model, correlates the 3D location of 
the reference structures with the 3D location of the targets, at any time 
point t) within the cycle P. 

17. A method in accordance with claim 16, 

wherein at least one reference structure is substantially stationary 
throughout the cycle P, and 

wherein the 4D mathematical model describes the 3D location of 
the targets with respect to the substantially stationary reference structure 
as a function of time in the cycle P. 

1 8. A method in accordance with claim 1 6, 

wherein at least one reference structure moves with the periodic 
motion of the patient; 
and 

wherein the 4D mathematical model describes the 3D location of 
the targets as a compound function of: i) the time the cycle P; and ii) the 
changing locations of the moving reference structure. 

1 9. A method in accordance with claim 4, wherein the act of generating 
the radiation dose distribution comprises: 

defining one or more radiation beam trajectories with respect to a 
nominal patient co-ordinate system, defined within an image lj taken at 
time point tj; 

for each beam, calculating the duration of the beam that would 
result in the delivery by the beam of a desired dose of radiation, if the 
beam were turned on from the time point tj onward; and 

calculating as a function of time the dose absorption resulting from 
the turning of the each beam at the time point tj, when the anatomical 
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region undergoes the deformation and motion described in the 
mathematical model. 

20. A method in accordance with claim 3, wherein the act of 
dynamically tracking the periodic motion of the anatomical region 
comprises: 

attaching one or more surface markers to at least a portion of a 
surface of the anatomical region; and 

monitoring in real time the motion of the surface markers in order to 
measure the cycle P, to determine the real time locations of the surface 
markers, and to determine the change in location and orientation of the 
portion of the anatomical region as a function of the position in time within 
the cycle P. 

21 . A method in accordance with claim 20, wherein the surface 
markers comprise LEDs (light emitting diodes). 

22. A method in accordance with claim 3, wherein the act of registering 
the near real time images of the reference structures with one or more 
pre-operative 3D scan data of the reference structures comprises: 

a) reconstructing a plurality of DRRs from the pre-operative 3D scan 
data; 

b) for each of a plurality of image resolution levels, estimating a 
plurality of 2D local motion fields, using a similarity measure between the 
DRRs and the near real time images; 

c) deriving a full motion field from the plurality of 2D local motion fields; 
and 

d) determining from the full motion field a set of non-rigid 
transformation parameters that represent the difference in the position 
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and orientation of the reference structures as shown in the near real time 
images, as compared to the position and orientation of the reference 
structures as shown in the DRRs. 

23. A method in accordance with claim 22, wherein the act of 
determining the near real time locations of the targets comprises 
constructing one or more 3D motion field vectors from the locations of the 
reference structures. 

24. A method in accordance with claim 20, wherein the act of 
computing the real time locations of the targets comprises: 

using the results of image registration between the near real time 
images and the pre-operative images of the reference structures to 
determine the locations of the reference structures; 

using the 4D mathematical model, and the location of the reference 
structures determined by the results of image registration, to find the 
corresponding locations of the targets; 

and 

using the real time locations of the surface markers, determined in 
claim 20, to construct a marker-target correlation model that relates the 
real time locations of the surface markers to the real time locations of the 
targets. 

25. A method in accordance with claim 20, wherein the act of 
computing the real time locations of the targets comprises: 

using the results of image registration between the near real time 
images and the preoperative images of the reference structures to 
determine the locations of the reference structures; 
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interpolating from the. locations of the reference structures to locate 
the targets; 
and 

using the real time locations of the surface markers, determined in 
real time in claim 20, to construct a marker-target correlation model that 
relates the real time locations of the surface markers to the real time 
locations of the targets. 

26. A method of dynamically tracking one or more targets within an 
anatomical region and delivering therapeutic radiation to the targets, while 
the anatomical region undergoes a periodic motion characterized by a 
motion cycle P, the anatomical region including one or more reference 
structures, the method comprising: 

monitoring the periodic motion to measure the cycle P; 
determining in real time the locations, within the moving anatomical 
region, of the targets relative to the reference structure; 
and 

delivering radiation treatment in real time to the targets, as the 
anatomical region undergoes the periodic motion. 

27. A method in accordance with claim 26, wherein the act of 
determining in real time the locations of the targets relative to the 
reference structures comprises: 

constructing a 4D mathematical model, which describes a non-rigid 
deformation of the anatomical region, as a function of time within the cycle 
P, and which describes a 3D motion of the targets relative to the reference 
structures, as a function of the time within the cycle P; 

determining the real time locations of the reference structures at 
each of a plurality of time points t,, i = 1 N, by registering near real 
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time images of the reference structures with pre-operative images of the 
reference structures, and monitoring the periodic motion of the anatomical 
region to determine relative position within the cycle P of each time point to 

and 

computing the real time locations of the targets at each time point t|, 
using the near real time locations of the reference structures and the 
known position within the cycle P of each time point t|. 

28. A method of dynamically tracking one or more targets within an 
anatomical region and delivering therapeutic radiation to the targets, while 
the anatomical region undergoes a periodic motion characterized by a 
cycle P, the anatomical region including at least one reference structure, 
the method comprising: 

constructing a 4D mathematical model that describes a non-rigid 
deformation of the anatomical region, and a 3D motion of the targets 
relative to the reference structure, as a function of the time within the 
cycle P; 

using the mathematical model to generate a desired dose 
distribution for delivery of therapeutic radiation to the targets; 

registering one or more near real time images of the reference 
structure with one or more pre-operative images of the reference structure, 
and dynamically tracking the periodic motion of the anatomical region to 
determine the relative position within the cycle P of desired time points, to 
derive the real time locations of the reference structure; 

using the real time locations of the reference structure, the 4D 
model, and knowledge of the relative position within the cycle P of desired 
time points, computing the real time locations of the targets; 
and 
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delivering therapeutic radiation to the real time locations of the 
targets, in accordance with the desired dose distribution. 

29. A method of dynamically tracking, and delivering therapeutic 
radiation to, one or more targets within an anatomical region of a patient 
while the patient undergoes a periodic motion characterized by a cycle P, 
the anatomical region including at least one reference structure, the 

method comprising: 

constructing from pre-operative scan data of the target a 4D 
mathematical model that describes a non-rigid deformation of the 
anatomical region as a function of time within the cycle P, and describes a 
3D motion of the targets relative to the reference structure; 

generating from the mathematical model a radiation dose 
distribution for delivering therapeutic radiation onto the one or more 
targets throughout the cycle P; 

performing non-rigid image registration of near real time 2D images 
of the reference structure with pre-operative 3D scans of the reference 
structure; 

determining the relative position within the cycle P of at least one 
time point, using one or more tracking devices attached to at least a 
portion of the anatomical region; 

computing in real time the locations of the targets at the at least 
one time point, using results of the non-rigid image registration and the 4D 
model; 

and 

delivering therapeutic radiation to the real time locations of the 
targets at the at least one time point within the cycle P, in accordance with 
the desired dose distribution. 
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30. A method in accordance with claim 29, wherein the tracking device 
comprises at least one of: 

a laser scanner; 

a video system; 

an ultrasound scanner; and 

an electrocardiogram. 

31 . An apparatus for dynamically tracking one or more targets within 

an anatomical region in order to deliver therapeutic radiation to the targets, 
the anatomical region including one or more reference structures and 
undergoing a motion, the apparatus comprising: 

a target locater configured to determine in real time the locations of 
the targets relative to the reference structures within the periodically 
moving anatomical region; 

a 4D treatment planning generator configured to generate a 4D 
treatment plan as a function of time for the moving anatomical region, the 
treatment plan prescribing a desired radiation dose distribution to be 
delivered in real time to the targets, and accounting for a deformation of 
the moving anatomical region during the motion. 

32. An apparatus in accordance with claim 31 , wherein the motion is a 
periodic motion characterized by a periodic cycle P. 

33. An apparatus in accordance with claim 32, 
wherein the target locater comprises: 

a) a 3D scanner configured to generate a plurality of 3D images lj (j = 
1 , .... p) of the anatomical region, at each of a succession of time points 
tj 0 = 1 , . . - , p) within the cycle P; 
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b) a DRR generator configured to reconstruct DRRs from the 3D scan 
data, at each time point tj f by casting hypothetical rays through the 
volumetric 3D scan data and integrating the 3D scan data along each ray; 

c) an x-ray imaging system configured to generate near real time 2D 
x-ray projection images of the targets and the reference structures within 
the moving anatomical region; 

d) an image registration subsystem configured to register the near 
real time x-ray projection images of the reference structures with the 
DRRs of the reference structures, thereby determining the locations of the 
reference structures; 

e) one or more motion sensors configured to monitor the periodic 
motion of the anatomical region and to measure the cycle P, so as to 
generate time position data representative of the relative position within P 
of one or more desired time points; 

f) a 4D model generator configured to generate a 4D mathematical 
model that describes a motion of the targets relative to the reference 
structures within the moving anatomical region, as a function of time 
within the cycle P; 

and 

g) a target location computer configured to compute in real time the 
locations of the targets at the one or more desired time points, by using 
the 4D mathematical model to correlate the locations of the targets with 
the known locations of the reference structures, as determined by the 
image registration subsystem, and by using the time position data to 
determine the relative position within P of each desired time point. 

34. An apparatus in accordance with claim 33, 

wherein the 4D model generator comprises a deformation model 
constructor configured to construct a mathematical model that describes 
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the deformation and motion of the anatomical region, as a function of the 
position in time within the cycle P; 

wherein the deformation model constructor is configured to extract, 
from the plurality of images lj generated by the 3D scanner, deformation 
data that contain information relating to the deformation and motion of the 
anatomical region; and 

wherein the deformation model constructor is configured to use the 
deformation data and the time position data to mathematically correlate, 
for each time point tj, the relative position within P of the time point tj with 
the deformation and motion of the targets at the time point tj. 

35. An apparatus in accordance with claim 32, wherein 

the motion of the anatomical region comprises respiration, and the cycle P 
comprises a respiratory cycle. 

36. An apparatus in accordance with claim 35, wherein the motion 
sensors comprise a breathing sensor. 

37. An apparatus in accordance with claim 32, wherein the motion of 
the anatomical region comprises a heartbeat, the cycle P comprises a 
heartbeat cycle, and the motion sensors comprise a heartbeat monitor. 

38. An apparatus in accordance with claim 33, wherein the plurality of 
images taken at the time points tj comprise at least one of: a 3D CT scan; 
a 3D MRI (magnetic resonance imaging) scan; a 3D PET (positron 
emission tomography) scan; and a 3D ultrasound scan. 

39. An apparatus in accordance with claim 32, further comprising a 
treatment beam generator configured to deliver therapeutic radiation to 
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the targets in real time in accordance with the radiation dose distribution, 
while the anatomical region undergoes the periodic motion. 

40. An apparatus in accordance with claim 32, wherein the sensor 
comprises at least one of: 

a surface marker; 

a laser scanner; 

a video system; 

an ultrasound scanner; and 

an electrocardiogram. 

41 . An apparatus in accordance with claim 33, wherein the time points 
tj G = 1 . - • - 1 P) are chosen to substantially encompass a full range of the 
periodic motion of the anatomical region. 

42. An apparatus in accordance with claim 41 , wherein the time points 
comprise at least: a first time point corresponding to a peak of the cycle P, 
and a second time point corresponding to a trough of the cycle P. 

43. An apparatus in accordance with claim 35, wherein the act of 
extracting the deformation data from the plurality of images comprises: 

registering each image that is taken at a time point tj within P, onto 
a consecutive image that is taken at a consecutive time point t j+ i within P. 

44. An apparatus in accordance with claim 35, wherein the information 
contained in the deformation data comprises information relating to the 
change in the position and orientation of the targets relative to the 
reference structures. 
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45. An apparatus in accordance with claim 32, wherein the targets 
comprise at least one of a tumor and a lesion, and wherein the reference 
structures comprise skeletal structures. 

46. An apparatus in accordance with claim 32, wherein the desired 
radiation dose distribution is indicative of a therapeutically effective 
amount of radiation to be delivered to the one or more targets, at each 
desired time point. 

47. An apparatus in accordance with claim 32, wherein the 3D scanner 
comprises at least one of: 

a 3D CT scanner; a 3D PET scanner; a 3D MRI scanner; and a 3D 
ultrasound scanner. 

48. An apparatus for dynamically tracking one or more targets within 

an anatomical region in order to deliver therapeutic radiation to the targets, 
the anatomical region including one or more reference structures and 
undergoing a periodic motion characterized by a cycle P so that the 
relative positions between the targets and the reference structures change 
periodically with time, the apparatus comprising: 

a 3D scan data generator configured to generate volumetric 3D 
scan data, at each of a succession of time points Tj within the period P; 

one or more motion sensors coupled to the scan data generator 
and configured to generate output signals representative of the relative 
position in time within P of each desired time point Tj, so that the 3D scan 
data generator can be activated at pre-selected positions in time within P; 

a DRR generator configured to reconstruct DRRs, at each time 
point Tj, from the 3D scan data by casting hypothetical rays through the 
volumetric 3D scan data and integrating the scan data along each ray; 
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a 4D treatment planning processor configured to generate a 
treatment plan as a function of the relative position in time within P for the 
moving anatomical region, in which the relative positions between the 
targets and the reference structure change periodically with time, the 
treatment plan representing a desired dose distribution of therapeutic 
radiation to be delivered to the moving targets as the anatomical region 
undergoes the periodic motion; 

wherein the 4D treatment planning processor is configured to 
generate, in response to receipt of the DRRs at each time point Tj, 
parameters Pj representative of a non-rigid deformation of the anatomical 
region, as shown in the DRR generated at Tj, when compared to the DRR 
generated at Tj.i, and to describe in a mathematical model these 
parameters Pj as a function of the relative position in time within P of the 
time point Tj, 

an x-ray imaging system configured to generate near real time 2D 
x-ray projection images of the targets and the reference structure within 
the moving anatomical region; 

an image registration processor configured to register the near real 
time x-ray projection images with the DRRs, so that the near real time 
locations of at least one of the reference structure and the targets can be 
determined throughout the periodic motion of the anatomical region; 

a treatment beam generator configured to generate, when 
activated, one or more beams x-rays having a strength sufficient to 
generate a therapeutic effect on the targets; 

one or more surface markers attached to at least a portion of the 
outer surface of the anatomical region, configured to sense the periodic 
motion of the anatomical region and to generate output signals 
representative of the relative position time within the period P of one or 
more desired time points; 
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a controller coupled to the treatment beam generator and 
responsive to the outputs of the 4D treatment planning processor, the 
image registration processor, and the surface markers, wherein the 
controller is configured to activate the treatment beam generator so as to 
deliver one or more beams of therapeutic radiation to the moving targets 
in accordance with the desired dose distribution, while the anatomical 
region undergoes the periodic motion. 

49, A method in accordance with claim 1 , wherein the act of 
determining in real time the locations of the targets relative to the 
reference structures within the moving anatomical region comprises: 

constructing a 4D mathematical model that correlates a 3D motion 
of the targets relative to a 3D motion of the reference structures, as a 
function of time, and accounts for a non-rigid deformation of the 
anatomical region during the motion; 

dynamically tracking the motion of the anatomical region so that the 
relative positions of one or more time points within a time interval can be 
determined in real time; 

registering one or more near real time images of the reference 
structures with one or more pre-operative images of the reference 
structures, to derive the locations of the reference structures; 
and 

using the locations of the reference structures, the 4D 
mathematical model, and the relative positions of the one or more time 
points within the time interval, to compute the locations of the targets in 
real time. 
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50. A method in accordance with claim 1 , wherein the deformation of 
the moving anatomical region comprises one of a non-rigid deformation 
and a rigid deformation. 

51. A method in accordance with claim 3, wherein the act of 
dynamically tracking the periodic motion of the anatomical region 
comprises: 

attaching a tracking device to at least a portion of a surface of the 
anatomical region; and 

monitoring in real time the tracking device in order to measure the 
cycle P, and to determine the change in location and orientation of the 
portion of the anatomical region as a function of the position in time within 
the cycle P. 

52. A method in accordance with claim 51 , wherein the tracking device 
comprises at least one of: 

a laser scanner; 

a video system; 

an ultrasound scanner; and 

an electrocardiogram. 
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FIG. 13 
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2D projections 
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FIG. 15 
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